Description

Isolation of the biosynthesis genes for pseudo-oligosaccharides from 
Streptomyces glaucescens GLA.O, and their use 

The present invention relates to the isolation of genes which encode
enzymes for the biosynthesis of α-amylase inhibitors, so-called
pseudo-oligosaccharides. The genes concerned are, in particular, genes from the
streptomycete strain Streptomyces glaucescens GLA.O (DSM 40716). In
addition, the present patent describes the use of these genes for
producing acarbose and homologous substances with the aid of
Streptomyces glaucescens GLA.O, the heterologous expression of these
genes in other strains which produce pseudo-oligosaccharides (e.g.
Actinoplanes sp. SE50/100) for the purpose of increasing and stabilizing
production, and also their heterologous expression in other
microorganisms such as E. coli, Bacillus subtilis, Actinomycetales, such as
Streptomyces, Actinoplanes, Ampullariella and Streptosporangium strains,
Streptomyces hygroscopicus var. limoneus and Streptomyces
glaucescens, and also biotechnologically relevant fungi (e.g. Aspergillus
niger and Penicillium chrysogenum) and yeasts (e.g. Saccharomyces
cerevisiae). The invention also relates to homologous genes in other
microorganisms and to methods for isolating them.

Streptomyces glaucescens GLA.O produces the two antibiotics
hydroxystreptomycin (Hutter (1967) Systematik der Streptomyceten
(Taxonomy of the Streptomycetes). Basel, Karger Verlag) and
tetracenomycin (Weber et al. (1979) Arch. Microbiol. 121: 111-116). It is
known that streptomycetes are able to synthesize structurally varied natural
products. However, the conditions under which these compounds are
produced are frequently unknown, or else the substances are only
produced in very small quantities and not detected.

The α-amylase inhibitor acarbose has been isolated from a variety of
Actinoplanes strains (SE50, SE82 and SE18) (Schmidt et al. (1977)
Naturwissenschaften 64: 535-536). This active substance was discovered
in association with screening for α-amylase inhibitors from organisms of
the genera Actinoplanes, Ampullariella and Streptosporangium. Acarbose
is a pseudotetrasaccharide which is composed of an unusual unsaturated
cyclitol unit to which an amino sugar, i.e. 4,6-dideoxy-4-amino-D-
glucopyranose, is bonded. Additional α-1,4-glycosidically linked
D-glucopyranose units can be bonded to the amino sugar. Thus, acarbose,
for example, contains two further molecules of D-glucose. The producing
strain synthesizes a mixture of pseudo-oligosaccharide products which
possess sugar side chains of different lengths (Schmidt et al. (1977)
Naturwissenschaften 64: 535-536). The acarbose cyclitol residue is
identical to the compound valienamine, which is a component of the
antibiotic validamycin A (Iwasa et al. (1979) J. Antibiot. 32: 595-602) from
Streptomyces hygroscopicus var. limoneus.

Acarbose can be produced by fermentation using an Actinoplanes strain
and has achieved great economic importance as a therapeutic agent for
diabetics. While Actinoplanes synthesizes a mixture of α-amylase inhibitor
products, it is only the compound having the relative molecular weight of
645.5 (acarviosin containing 2 glucose units) (Truscheit (1984) VIIIth
International Symposium on Medicinal Chemistry, Proc. Vol. 1. Swedish
Academy of Pharmaceutical Sciences, Stockholm, Sweden) which is
employed under the generic name of acarbose. The fermentation
conditions are selected to ensure that acarbose is the main product of the
fermentation. Alternatives are to use particular selectants and strains in
which acarbose is formed as the main product or to employ purification
processes for achieving selective isolation (Truscheit (1984) VIIIth
International Symposium on Medicinal Chemistry, Proc. Vol. 1. Swedish
Academy of Pharmaceutical Sciences, Stockholm, Sweden). It is also
possible to transform the product mixture chemically in order, finally, to
obtain the desired product acarbose.

In contrast to the genus Streptomyces, the genus Actinoplanes has not so
far been investigated intensively from the genetic point of view. Methods
which were established for the genus Streptomyces are not always
transferable to the genus Actinoplanes. In order to use
molecular biological methods to optimize acarbose production in a
purposeful manner, the genes for acarbose biosynthesis have to be
isolated and characterized. One obvious possibility in this context is to
attempt to set up a host/vector system for Actinoplanes sp. However,
this is very tedious and elaborate because Actinoplanes has so far been
studied only relatively superficially.
The invention described in the present patent application achieves the
object of cloning the biosynthesis genes for acarbose and homologous 
pseudo-oligosaccharides, with these genes being cloned from 
Streptomyces glaucescens GLA.O, which is a streptomycete which has 
been thoroughly investigated genetically (Crameri et al. (1983) J. Gen. 
Microbiol. 129: 519-527; Hintermann et al. (1984) Mol. Gen. Genet. 196: 
513-520; Motamedi and Hutchinson (1987) PNAS USA 84: 4445-4449; 
Geistlich et al. (1989) Mol. Microbiol. 3: 1061-1069) and which, surprisingly, 
is an acarbose producer. In starch-containing medium, Streptomyces 
glaucescens GLA.O produces pseudo-oligosaccharides having the 
molecular weights 645, 807 and 970. 
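
The three molecular weights quoted above are consistent with a series of pseudo-oligosaccharides that differ by one glucose unit each, as described earlier for sugar side chains of different lengths. The following is only an illustrative arithmetic check, not part of the original disclosure: it assumes rounded average atomic masses and that each additional glycosidically linked glucose contributes C6H10O5 (glucose minus one water).

```python
# Average atomic masses in g/mol (rounded; an assumption of this sketch).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def residue_mass(formula):
    """Return the mass of a residue given as {element: atom count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


# A glucose unit added through a glycosidic bond contributes C6H10O5
# (glucose, C6H12O6, minus one molecule of water).
ANHYDROGLUCOSE = residue_mass({"C": 6, "H": 10, "O": 5})

# Molecular weights reported in the text for the S. glaucescens GLA.O products.
reported = [645, 807, 970]

# Differences between neighbouring members of the series.
steps = [heavier - lighter for lighter, heavier in zip(reported, reported[1:])]

for step in steps:
    print(f"observed step {step} vs. one glucose unit {ANHYDROGLUCOSE:.2f}")
```

Within the rounding of the reported values, each step corresponds to roughly one glucose unit.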

Part of the subject matter of the invention is, therefore, the isolation of the 
corresponding biosynthesis genes from Streptomyces glaucescens GLA.O 
and their use for isolating the adjoining DNA regions in order to complete 
the gene cluster of said biosynthesis genes. 

The isolation of the genes for biosynthesizing pseudo-oligosaccharides, 
and the characterization of these genes, are of great importance for 
achieving a better understanding of the biosynthesis of the pseudo- 
oligosaccharides and its regulation. This knowledge can then be used to 
increase the productivity of the Streptomyces glaucescens GLA.O strain 
with regard to acarbose production by means of established classical and 
molecular biological methods. In addition to this, the entire gene cluster 
which encodes the synthesis of the pseudo-oligosaccharides, or individual 
genes from this gene cluster, can also be expressed in other 
biotechnologically relevant microorganisms in order to achieve a further 
increase in, or a simplification of, the preparation of pseudo- 
oligosaccharides such as acarbose. Specific modification of the 
biosynthesis genes can also be used to prepare a strain which exclusively 
produces acarbose having a molecular weight of 645. Since the genes for 
biosynthesizing antibiotics are always present in clusters and are often very 
strongly conserved (Stockmann and Piepersberg (1992) FEMS Microbiol. 
Letters 90: 185-190; Malpartida et al. (1987) Nature 314:642-644), the 
Streptomyces glaucescens GLA.O genes can also be used as a probe for 
isolating the acarbose-encoding genes from Actinoplanes sp., for example. 
The expression of regulatory genes, or of genes which encode limiting 
steps in the biosynthesis, can result in productivity in Streptomyces
glaucescens GLA.O, Actinoplanes sp. or corresponding producer strains
being increased. An increase in productivity can also be achieved by
switching off (knocking out or mutagenizing) those acarbose biosynthesis
genes which have an inhibitory effect in the biosynthesis.

One possible strategy for cloning antibiotic biosynthesis genes which have
not previously been isolated is that of using gene-specific probes
(Stockmann and Piepersberg (1992) FEMS Microbiol. Letters 90: 185-190;
Malpartida et al. (1987) Nature 314:642-644). These probes can be DNA
fragments which are 32P-labeled or labeled in some other way; otherwise,
the appropriate genes can be amplified directly from the strains to be
investigated using degenerate PCR primers and isolated chromosomal
DNA as the template.

The latter method has been employed in the present study. Pseudo-
oligosaccharides such as acarbose contain a 4,6-deoxyglucose building
block as a structural element. The enzyme dTDP-glucose 4,6-dehydratase
is known to be involved in the biosynthesis of 4,6-deoxyglucose
(Stockmann and Piepersberg (1992) FEMS Microbiol. Letters 90: 185-190).
Since deoxysugars are a frequent constituent of natural products and
antibiotics, the gene for this enzyme may serve as a means for isolating the
corresponding antibiotic biosynthesis genes. Since these genes are always
present as clusters, it is sufficient to initially isolate one gene; the isolation
and characterization of the adjoining DNA regions can then be undertaken
subsequently.

For example, a dTDP-glucose 4,6-dehydratase catalyzes a step in the
biosynthesis of hydroxystreptomycin in Streptomyces glaucescens GLA.O
(Retzlaff et al. (1993) Industrial Microorganisms. Basic and applied
molecular genetics ASM, Washington DC, USA). Further dTDP-glucose
4,6-dehydratases have been isolated from other microorganisms, for
example from Streptomyces griseus (Pissowotzki et al. (1991) Mol. Gen.
Genet. 231: 113-123), Streptomyces fradiae (Merson-Davies and
Cundliffe (1994) Mol. Microbiol. 13: 349-355) and Streptomyces
violaceoruber (Bechthold et al. (1995) Mol. Gen. Genet. 248: 610-620).
It was consequently possible to deduce the sequences for the PCR primers
for amplifying a dTDP-glucose 4,6-dehydratase from the amino acid
sequences of already known biosynthesis genes. For this, conserved
regions in the protein sequences of these enzymes were selected and the
amino acid sequences were translated into a nucleic acid sequence in
accordance with the genetic code. The protein sequences were taken from
the EMBL and Genbank databases. The following sequences were used:
Streptomyces griseus; accession number: X62567 gene: strE (dated
10.30.1993); Streptomyces violaceoruber; accession number: L37334
gene: graE (dated 04.10.1995); Saccharopolyspora erythraea; accession
number: L37354 gene: gdh (dated 11.09.1994). A large number of possible
primer sequences are obtained as a result of the degeneracy of the genetic
code. The fact that streptomycetes usually contain a G or C in the third
position of a codon (Wright and Bibb (1992) Gene 113: 55-65) reduces the
number of primers to be synthesized. These primer mixtures can then be
used to carry out a PCR amplification with the DNA from strains to be
investigated, with the amplification ideally leading to an amplified DNA
fragment. In the case of highly conserved proteins, this fragment is of a
predictable length which ensues from the distance between the primers in
the nucleic acid sequence of the corresponding gene. However, an
experimental mixture of this nature does not inevitably have to result in an
amplification product. The primers may be too unspecific and amplify a very
large number of fragments; alternatively, no PCR product is obtained if there
are no complementary binding sites in the chromosome for the PCR primers
which have been prepared.
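
The effect of the G/C bias in the third codon position on the number of primer variants can be illustrated with a short back-translation sketch. This is not the procedure used by the authors; it merely counts how many nucleotide sequences encode a given peptide under the standard genetic code, with and without restricting the third codon position to G or C. The peptide `HFVTDE` is a hypothetical example and is not taken from Table 1.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid, gc_third_only=False):
    """List the codons for one amino acid, optionally keeping only G/C-ending ones."""
    codons = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
    if gc_third_only:
        codons = [c for c in codons if c[2] in "GC"]
    return codons


def degeneracy(peptide, gc_third_only=False):
    """Number of distinct nucleotide sequences that encode the peptide."""
    count = 1
    for amino_acid in peptide:
        count *= len(codons_for(amino_acid, gc_third_only))
    return count


# Hypothetical conserved motif (an assumption for illustration only).
motif = "HFVTDE"
print(degeneracy(motif))                      # all synonymous codons
print(degeneracy(motif, gc_third_only=True))  # only codons ending in G or C
```

For this hypothetical six-residue motif, restricting the third position to G or C reduces the number of primer variants from 256 to 4, which illustrates why the codon bias makes the primer mixtures more manageable.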

The investigation of the streptomycete strain Streptomyces glaucescens
GLA.O resulted in an amplified DNA fragment (acbD) which had the
expected length of 550 bp. Further investigation showed that, besides
containing a dTDP-glucose 4,6-dehydratase gene for biosynthesizing
hydroxystreptomycin, this strain surprisingly contains a second dTDP-
glucose 4,6-dehydratase gene for biosynthesizing pseudo-oligo-
saccharides such as acarbose. While the two genes exhibit a high degree
of homology, they are only 65% identical at the amino acid level.
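
A percentage identity of this kind is normally derived from a pairwise alignment of the two protein sequences. The sketch below shows one common convention, namely identical residues divided by the number of aligned columns that contain no gap; the source does not state which convention or program settings were used for the 65% figure, and the two aligned fragments are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # columns with a gap are skipped in this convention
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments (not the actual dehydratase sequences).
fragment_1 = "MKILVTGGAG"
fragment_2 = "MRVLVTG-AG"
print(f"{percent_identity(fragment_1, fragment_2):.1f}% identical")
```

Other conventions (for example, dividing by the full alignment length or by the length of the shorter sequence) give slightly different values, so identity figures are only comparable when the same convention is used.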

The acbD probe (see Example 2 and Table 2A) was used to isolate, from
Streptomyces glaucescens GLA.O, a 6.8 kb PstI DNA fragment which
carries a variety of genes (acbA, acbB, acbC, acbD, acbE and acbF)
which are involved in the biosynthesis of the pseudo-oligosaccharides.

Deleting the acbBCD genes (aminotransferase, acbB; dTDP-glucose
synthase, acbC; dTDP-glucose 4,6-dehydratase, acbD; see Example 6)
resulted in the production of a mutant of Streptomyces glaucescens GLA.O
which no longer produces any pseudo-oligosaccharides in the production
medium. The involvement of the acbBCD genes in the synthesis of
pseudo-oligosaccharides was therefore verified by deleting the
corresponding loci.

The two genes, i.e. dTDP-glucose synthase and dTDP-glucose 4,6-
dehydratase, ought to be involved in the biosynthesis of the deoxysugar of
the pseudo-oligosaccharides, as can be concluded from the function of
thoroughly investigated homologous enzymes (see above). The
aminotransferase (encoded by the acbB gene) is probably responsible for
transferring the amino group either to the sugar residue or to the cyclitol
residue. By analyzing the protein sequence of acbB, an amino acid motif
was found which is involved in binding pyridoxal phosphate. This motif is
typical of class III aminotransferases (EC 2.6.1.11; EC 2.6.1.13; EC
2.6.1.18; EC 2.6.1.19; EC 2.6.1.62; EC 2.6.1.64; EC 5.4.3.8). The precise
enzymic function of acbB can only be elucidated by further investigation of
the biosynthesis of the pseudo-oligosaccharides. acbE encodes a
transcription-regulating protein which exhibits a great deal of similarity to
DNA-binding proteins which possess a helix-turn-helix motif (e.g. Bacillus
subtilis DegA, P37947: Swiss-Prot database). Thus, the transcription
activator CcpA from Bacillus subtilis inhibits the formation of α-amylase in
the presence of glucose, for example (Henkin et al. (1991) Mol. Microbiol.
5: 575-584). Other representatives of this group are proteins which
recognize particular sugar building blocks and are able to exhibit a positive
or negative effect on the biosynthesis of metabolic pathways. The
biosynthesis of the pseudo-oligosaccharides is also regulated in
Streptomyces glaucescens GLA.O. Previously, it was only possible to
demonstrate the synthesis of pseudo-oligosaccharides on
starch-containing media. While this observation indicated that AcbE might be
responsible for regulating pseudo-oligosaccharide synthesis, the precise
mechanism is still not known. However, molecular biological methods can
now be used to modify the gene specifically in order to obtain an increased
rate of pseudo-oligosaccharide biosynthesis. Furthermore, the DNA site at
which AcbE binds can be identified by means of so-called gel shift assays
(Miwa et al. (1994) Microbiology 140: 2576-2575). An increase in the rate
at which acarbose is biosynthesized can be achieved after identifying and
then modifying the promoters and other regulatory DNA regions which are
responsible for the transcription of the pseudo-oligosaccharide genes.

At present, the function of acbF is still not definitely known. The
corresponding gene product exhibits homologies with sugar-binding
proteins such as the sugar-binding protein from Streptococcus mutans
(MsmE; Q00749: SwissProt database), making it probable that it is involved
in the biosynthesis of the pseudo-oligosaccharides. The gene product of
the acbA gene exhibits homologies with known bacterial ATP-binding
proteins (e.g. from Streptomyces peucetius DrrA, P32010: SwissProt
database). The AcbA protein possesses the typical ATP/GTP binding
motif, i.e. the so-called P loop. These proteins constitute an important
component of so-called ABC transporters, which are involved in the active
transport of metabolites at biological membranes (Higgins (1995) Cell 82:
693-696). Accordingly, AcbA could be responsible for exporting
pseudo-oligosaccharides out of the cell or be involved in importing sugar
building blocks, such as maltose, for biosynthesizing α-amylase inhibitors.
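
The P loop mentioned above corresponds to the ATP/GTP-binding site motif A catalogued in the Prosite database as the pattern [AG]-x(4)-G-K-[ST]. As a minimal sketch of how such a motif can be located in a protein sequence, the following example translates that pattern into a regular expression. The test sequence is hypothetical and is not the AcbA sequence from Table 4; the Motifs program actually used by the authors may behave differently.

```python
import re

# Prosite-style P-loop pattern [AG]-x(4)-G-K-[ST] written as a regular expression.
P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def find_p_loops(protein_sequence):
    """Return (0-based start, matched segment) for each non-overlapping P-loop hit."""
    return [(m.start(), m.group()) for m in P_LOOP.finditer(protein_sequence)]


# Hypothetical sequence fragment used only for illustration.
example = "MSLAGPSGSGKSTLL"
for start, segment in find_p_loops(example):
    print(f"P loop candidate at position {start + 1}: {segment}")
```

A match of this short pattern is only an indication; because the motif is brief, it also occurs by chance in proteins that do not bind nucleotides.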

All streptomycete genes for biosynthesizing secondary metabolites which
have so far been analyzed are arranged in a cluster. For this reason, it is to
be assumed that the acarbose biosynthesis genes according to the
application are also arranged in such a gene cluster. The remaining genes
which are relevant for pseudo-oligosaccharide biosynthesis can therefore
also be isolated by isolating the DNA regions which adjoin the 6.8 kb PstI
DNA fragment according to the invention. As has also already been
mentioned above, it is readily possible to isolate homologous gene clusters
from microorganisms other than Streptomyces glaucescens GLA.O.

The invention therefore relates to a recombinant DNA molecule which
comprises genes for biosynthesizing acarbose and homologous
pseudo-oligosaccharides, in particular a recombinant DNA molecule in which
individual genes are arranged, with respect to their direction of transcription
and order, as depicted in Figure 3 and/or which exhibits a restriction
enzyme cleavage site pattern as depicted in Figure 3, and, preferably, to a
recombinant DNA molecule which

(a) comprises a DNA sequence according to Table 4, or parts thereof;

(b) comprises a DNA sequence which is able to hybridize, under
stringent conditions, with the DNA molecule according to (a), or
parts thereof; or

(c) comprises a DNA sequence which, because of the degeneracy of
the genetic code, differs from the DNA molecules according to (a)
and (b) but which permits the expression of the proteins which can
be correspondingly expressed using the DNA molecule according to
(a) and (b), or parts thereof.

The invention furthermore relates to a recombinant DNA molecule which
comprises the acbA gene, in particular which is characterized in that it
comprises the DNA sequence of nucleotides 1 to 720 according to Table 4,
or parts thereof; to a recombinant DNA molecule which comprises the
acbB gene, in particular which is characterized in that it comprises the DNA
sequence of nucleotides 720 to 2006 according to Table 4, or parts
thereof; to a recombinant DNA molecule which comprises the acbC gene,
in particular which is characterized in that it comprises the DNA sequence
of nucleotides 2268 to 3332 according to Table 4, or parts thereof; to a
recombinant DNA molecule which comprises the acbD gene, in particular
which is characterized in that it comprises the DNA sequence of
nucleotides 3332 to 4306 according to Table 4, or parts thereof; to a
recombinant DNA molecule which comprises the acbE gene, in particular
which is characterized in that it comprises the DNA sequence of
nucleotides 4380 to 5414 according to Table 4, or parts thereof; and to a
recombinant DNA molecule which comprises the acbF gene, in particular
which is characterized in that it comprises the DNA sequence of
nucleotides 5676 to 6854 according to Table 4, or parts thereof.

The invention furthermore relates to oligonucleotide primers for the PCR
amplification of a recombinant DNA molecule which is as described above
and which comprises genes for biosynthesizing acarbose and homologous
pseudo-oligosaccharides, with the primers having, in particular, the
sequence according to Table 1.
The invention furthermore relates to a vector which comprises a
recombinant DNA molecule which comprises a DNA molecule as described
in the penultimate and antepenultimate paragraphs, in particular which is
characterized in that the vector is an expression vector and said DNA
molecule is linked operatively to a promoter sequence, with the vector
preferably being suitable for expression in host organisms which are
selected from the group consisting of E. coli, Bacillus subtilis,
Actinomycetales, such as Streptomyces, Actinoplanes, Ampullariella and
Streptosporangium strains, Streptomyces hygroscopicus var. limoneus,
Streptomyces glaucescens and also biotechnologically relevant fungi (e.g.
Aspergillus niger, Penicillium chrysogenum) and yeasts (e.g.
Saccharomyces cerevisiae), with Streptomyces glaucescens GLA.O or
Actinoplanes sp. being very particularly preferred. Since the operative
linkage of said DNA molecule to promoter sequences of the vector is only
one preferred embodiment of the invention, it is also possible for
expression to be achieved using promoter sequences which are
endogenous in relation to the DNA molecule, e.g. the promoters which are
in each case natural, or the natural promoters which have been mutated
with regard to optimizing the acarbose yield. Such natural promoters are
part of the DNA molecule according to the invention.

The invention furthermore relates to a vector which comprises a DNA
molecule according to the invention for use in a process for eliminating or
altering natural acarbose biosynthesis genes in an acarbose-producing
microorganism. Such a vector is preferably selected from the group
consisting of pGM160 and vectors as described in European Patents EP 0
334 282 and EP 0 158 872.

The invention furthermore relates to a host cell which is transformed with
one of the above-described DNA molecules or vectors, in particular
characterized in that said host cell is selected from the group consisting of
E. coli, Bacillus subtilis, Actinomycetales, such as Streptomyces,
Actinoplanes, Ampullariella or Streptosporangium strains, Streptomyces
hygroscopicus var. limoneus or Streptomyces glaucescens, and also
biotechnologically relevant fungi (e.g. Aspergillus niger and Penicillium
chrysogenum) and yeasts (e.g. Saccharomyces cerevisiae); it is very
particularly preferred for it to be selected from the group consisting of
Streptomyces glaucescens GLA.O and Actinoplanes sp.
The invention furthermore relates to a protein mixture which can be
obtained by expressing the genes of the recombinant DNA molecule 
according to the invention, comprising genes for biosynthesizing acarbose 
and homologous pseudo-oligosaccharides, in particular characterized in 
that the DNA molecule 

(a) comprises a DNA sequence according to Table 4, or parts thereof; 

(b) comprises a DNA sequence which is able to hybridize, under 
stringent conditions, with the DNA molecule according to (a) or parts 
thereof; or 

(c) comprises a DNA sequence which, because of the degeneracy of 
the genetic code, differs from the DNA molecules according to (a) 
and (b) but which permits the expression of the proteins which can 
correspondingly be expressed using the DNA molecule according to 
(a) and (b), or parts thereof. 

The invention furthermore relates to isolated proteins which can be 
obtained by expressing the genes which are encoded by the DNA molecule 
described in the previous paragraph. 

The following statements apply to all the individual genes identified within 
the context of the present invention and have only been brought together 
for reasons of clarity: the invention furthermore relates to a protein which is 
encoded by a recombinant DNA molecule as described in the last 
paragraph but one, in particular characterized in that it comprises the DNA 
sequence of nucleotides 1 to 720 or 720 to 2006 or 2268 to 3332 or 3332 
to 4306 or 4380 to 5414 or 5676 to 6854 according to Table 4 or parts 
thereof; a protein is very particularly preferred which is encoded by the 
acbA gene or the acbB gene or the acbC gene or the acbD gene or the 
acbE gene or the acbF gene, and which comprises the amino acid 
sequence according to Table 4 or parts thereof. 
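
The nucleotide ranges listed above can be checked for internal consistency with a few lines of arithmetic. The sketch below only uses the coordinates quoted in the text; it assumes that each range is inclusive at both ends and covers a complete reading frame, and it makes no statement about strand orientation or about whether a stop codon is included, since neither is specified here.

```python
# Inclusive nucleotide ranges as quoted in the text (numbering of Table 4).
GENE_COORDINATES = {
    "acbA": (1, 720),
    "acbB": (720, 2006),
    "acbC": (2268, 3332),
    "acbD": (3332, 4306),
    "acbE": (4380, 5414),
    "acbF": (5676, 6854),
}


def gene_length(start, end):
    """Length in nucleotides of an inclusive coordinate range."""
    return end - start + 1


LENGTHS = {name: gene_length(*coords) for name, coords in GENE_COORDINATES.items()}

for name, length in LENGTHS.items():
    codons, remainder = divmod(length, 3)
    print(f"{name}: {length} nt, {codons} codons, remainder {remainder}")
```

All six ranges are whole multiples of three nucleotides, as expected for coding regions. The ranges for acbA/acbB and acbC/acbD share a boundary nucleotide as listed, which the sketch reports but does not interpret.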

The invention furthermore relates to a process for obtaining the proteins 
which were described above as being part of the subject-matter of the 
invention, which process is characterized in that 

(a) the proteins are expressed in a suitable host cell, in particular which
is characterized in that said host cell is selected from the group
consisting of E. coli, Bacillus subtilis, Actinomycetales, such as
Streptomyces, Actinoplanes, Ampullariella or Streptosporangium
strains, Streptomyces hygroscopicus var. limoneus or Streptomyces
glaucescens, and also biotechnologically relevant fungi (e.g.
Aspergillus niger and Penicillium chrysogenum) and yeasts (e.g.
Saccharomyces cerevisiae); with the host cell very particularly
preferably being selected from the group consisting of Streptomyces
glaucescens GLA.O and Actinoplanes sp., and
(b) are isolated.



The invention furthermore relates to a process for preparing acarbose,
characterized in that

(a) one or more genes of the recombinant DNA molecule which
comprises a DNA sequence according to Table 4 or parts thereof or
which comprises a DNA sequence which is able to hybridize, under
stringent conditions, with the DNA molecule according to Table 4, or
parts thereof, or which comprises a DNA sequence which, because
of the degeneracy of the genetic code, differs from the DNA
molecules which have just been described but which permits the
expression of the proteins which can be correspondingly expressed
using these DNA molecules, or parts thereof, is/are used for
expression in a suitable host cell which is selected, in particular,
from the same group as in the last paragraph, and

(b) the acarbose is isolated from culture supernatants of said host cell.

The invention furthermore relates to a process for preparing acarbose,
characterized in that

(a) one or more genes of the recombinant DNA molecule which
comprises a DNA sequence according to Table 4 or parts thereof or
which comprises a DNA sequence which is able to hybridize, under
stringent conditions, with the DNA molecule according to Table 4, or
parts thereof, or which comprises a DNA sequence which, because
of the degeneracy of the genetic code, differs from the DNA
molecules which have just been described but which permits
expression of the proteins which can be correspondingly expressed
using the DNA molecules, or parts thereof, is/are eliminated in an
acarbose-producing host cell, in particular Streptomyces
glaucescens GLA.O and Actinoplanes sp., and

(b) the acarbose is isolated from said host cell.
In this connection, the elimination of one or more genes can be effected by
means of standard molecular biological methods, for example using the
above-described vectors (pGM160 and others). A gene to be eliminated
could, for example, be the acbE gene, which probably has a regulatory
function. Genes could likewise be eliminated with the aim of obtaining pure
acarbose as the only fermentation product and no longer obtaining a
mixture of homologous pseudo-oligosaccharides (see above). The
elimination of said genes is preferably achieved using the vectors which
have been described above for this purpose.

The invention furthermore relates to a process for preparing acarbose,
characterized in that the processes for preparing acarbose which have
been described in the previous two paragraphs are combined with each
other, such that one or more of said genes is/are expressed
artificially and one or more of said genes is/are eliminated.

The invention furthermore relates to a process for altering the gene
expression of endogenous acarbose biosynthesis genes by mutating the
respective gene promoter in order to obtain improved yields of acarbose. In
this context, known methods of homologous recombination can be used to
introduce the mutations into the production strain to be improved. These
mutations can be transitions, deletions and/or additions. An "addition" can,
for example, denote the addition of one single nucleotide or several
nucleotides or of one or more DNA sequences which have a positive
regulatory effect and which bring about an enhancement of the expression
of an endogenous gene for biosynthesizing acarbose. The converse case,
i.e. the addition of a DNA sequence which has a negative regulatory effect
for repressing an endogenous acarbose biosynthesis gene, is also a
preferred form of an addition. "Transitions" may, for example, be nucleotide
exchanges which reduce or amplify the effect of regulatory elements which
act negatively or positively. "Deletions" can be used to remove regulatory
elements which act negatively or positively. The endogenous genes of this
process are preferably present in Actinomycetales, such as Streptomyces,
Actinoplanes, Ampullariella or Streptosporangium strains, Streptomyces
hygroscopicus var. limoneus or Streptomyces glaucescens; very
particularly, they are present in Streptomyces glaucescens GLA.O and
Actinoplanes sp.
The invention furthermore relates to the use of Streptomyces glaucescens
GLA.O for obtaining acarbose.

The invention furthermore relates to the use of Streptomyces glaucescens
GLA.O for preparing mutants of this strain by the "classical route", which
mutants make it possible to achieve a more abundant production of acarbose.
The methods for preparing improved natural product producers of this nature
have been known for a long time and frequently make use of classical
steps of mutagenesis and selection.

The invention furthermore relates to a process for completing the gene 
cluster for biosynthesizing acarbose and homologous polysaccharides 
according to Table 4, characterized in that 

a) hybridization probes which are derived from the DNA molecule 
according to Table 4 are prepared, 

b) these hybridization probes are used for the genomic screening of 
DNA libraries obtained from Streptomyces glaucescens GLA.O, and 

c) the clones which are found are isolated and characterized. 

The invention furthermore relates to a process for completing the gene 
cluster for biosynthesizing acarbose and homologous pseudo- 
oligosaccharides according to Table 4, characterized in that, proceeding 
from the recombinant DNA molecule according to Table 4, 

a) PCR primers are prepared, 

b) these PCR primers are used to accumulate DNA fragments of 
genomic DNA from Streptomyces glaucescens GLA.O, with these 
primers being combined with those primers which hybridize from 
sequences of the vector system employed, 

c) the accumulated fragments are isolated and characterized. 

The invention furthermore relates to a process for isolating a gene cluster 
for biosynthesizing acarbose and homologous pseudo-oligosaccharides 
from acarbose-producing microorganisms other than Streptomyces 
glaucescens GLA.O, characterized in that, proceeding from the 
recombinant DNA molecule according to Claim 4, 
a) hybridization probes are prepared, 
b) these hybridization probes are used for the genomic or cDNA
screening of DNA libraries which have been obtained from the 
corresponding microorganism, and 

c) the clones which are found are isolated and characterized. 

The invention furthermore relates to a process for isolating a gene cluster 
for biosynthesizing acarbose and homologous pseudo-oligosaccharides 
from acarbose-producing microorganisms other than Streptomyces 
glaucescens GLA.O, characterized in that, proceeding from the 
recombinant DNA molecule according to Claim 4, 

a) PCR primers are prepared, 

b) these PCR primers are used for accumulating DNA fragments of 
genomic DNA or cDNA from a corresponding microorganism,

c) the accumulated fragments are isolated and characterized, and 

d) where appropriate, employed in a process as described in the 
previous paragraph. 

The described processes for isolating a gene cluster for the biosynthesis of 
acarbose and homologous pseudo-oligosaccharides from acarbose- 
producing microorganisms other than Streptomyces glaucescens GLA.O 
are characterized in that the microorganisms are selected from the group 
consisting of Actinomycetales, such as Streptomyces, Actinoplanes, 
Ampullariella and Streptosporangium strains, Streptomyces hygroscopicus 
var. limoneus and Streptomyces glaucescens, preferably from the group 
consisting of Streptomyces glaucescens GLA.O and Actinoplanes sp. 

The invention furthermore relates to the use of Streptomyces glaucescens 
GLA.O for isolating acarbose. 

The invention will now be explained in more detail with the aid of the 
examples, tables and figures, without being restricted thereto. 

All the plasmid isolations were carried out using a Macherey and Nagel
(Düren, Germany) isolation kit (Nucleobond®) in accordance with the
manufacturer's instructions. Molecular biological procedures were carried
out in accordance with standard protocols (Sambrook et al. (1989)
Molecular Cloning: A Laboratory Manual, 2nd ed. Cold Spring Harbor
Laboratory, USA) or in accordance with the instructions of the respective
manufacturer. DNA and protein sequences were examined using Genetics
Computer Group Software, Version 8 (programs: FastA, TFastA, BlastX,
Motifs, GAP and CODONPREFERENCE) and the SwissProt (release 32),
EMBL (release 46) and Prosite (release 12.2) databases. The molecular
biological manipulations of Streptomyces glaucescens and Actinoplanes
(DNA isolation and DNA transformations) were carried out as described in
Hopwood et al.: Genetic Manipulation of Streptomyces: A Laboratory
Manual. The John Innes Foundation, Norwich, UK, 1985 and Motamedi
and Hutchinson: Cloning and heterologous expression of a gene cluster for
the biosynthesis of tetracenomycin C, the anthracycline antitumor antibiotic
of Streptomyces glaucescens. Proc. Natl. Acad. Sci. USA 84:4445-4449
(1987).

In general, hybridizations were performed using the "Non-radioactive DNA
labeling kit" from Boehringer Mannheim (Cat. No. 1175033). The DNA was
visualized using the "Luminescent Detection Kit" from Boehringer
Mannheim (Cat. No. 1363514). In all the examples given in this patent
application, hybridization was carried out under stringent conditions: 68°C,
16 h, 5x SSC, 0.1% N-laurylsarcosine, 0.02% SDS, 1% Blocking Reagent
(Boehringer Mannheim). SSC denotes 0.15 M NaCl/0.015 M sodium citrate.
The definition of "stringent conditions" which is given here applies to all
aspects of the present invention which refer to "stringent conditions". In this
connection, the manner of achieving this stringency, i.e. the cited
hybridization conditions, is not intended to have a limiting effect since the
skilled person can select other conditions as well in order to achieve the
same stringent conditions, e.g. by means of using other hybridization
solutions in combination with other temperatures.
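
As a worked illustration of the buffer definition given above, the
following minimal sketch scales the stated SSC composition (0.15 M NaCl,
0.015 M sodium citrate) to the 5x strength used for hybridization. The
function name and the dictionary are illustrative assumptions and are not
part of the described procedure.

```python
# SSC composition as defined in the text (1x strength), in mol/L.
SSC_1X_MOLAR = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold):
    """Return the molar concentration of each SSC component at `fold` strength."""
    return {name: round(conc * fold, 6) for name, conc in SSC_1X_MOLAR.items()}


if __name__ == "__main__":
    for name, molar in ssc_molarity(5).items():
        print(f"5x SSC: {molar} M {name}")
```

Under these assumptions, 5x SSC corresponds to 0.75 M NaCl and 0.075 M
sodium citrate.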

Example 1: Synthesis and sequences of the PCR primers and
amplification of the fragments from S. glaucescens GLA.O

The PCR was carried out under standard conditions using in each case
100 pmol of primer 1 and of primer 2 in 100 µl of reaction mixture:

PCR buffer (1)                    10 µl
PCR primers                       in each case 2.5 µl
dNTPs                             in each case 0.2 mM
BSA (10 mg/ml)                    1 µl
Template DNA                      1 µg (1 µl)
Taq polymerase (2) (5 units/ml)   1.5 µl
H2O                               to make up to 100 µl

(1): Promega
(2): Boehringer Mannheim

The samples are overlaid with 75 µl of mineral oil and the amplification is
carried out using a Perkin Elmer TC1 DNA thermal cycler.

Parameters:

Cycles   Temperature   Duration
1        96°C          5 min
         74°C          5 min
30       95°C          1.5 min
         74°C          1.5 min
1        74°C          5 min
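
The cycling parameters listed above can be summed to estimate the total
programmed hold time. The sketch below is illustrative only: the data
structure and function name are assumptions, and ramp times between
temperatures, which depend on the instrument, are not included.

```python
# Thermal program as listed in the parameters table:
# (number of cycles, [(temperature in °C, duration in minutes), ...])
PROGRAM = [
    (1, [(96, 5.0), (74, 5.0)]),
    (30, [(95, 1.5), (74, 1.5)]),
    (1, [(74, 5.0)]),
]


def total_hold_minutes(program):
    """Sum the programmed hold times over all cycles (ramping excluded)."""
    return sum(
        cycles * sum(minutes for _temp, minutes in steps)
        for cycles, steps in program
    )


if __name__ == "__main__":
    print(f"Total hold time: {total_hold_minutes(PROGRAM)} min")
```

With the values as listed, the hold times add up to 10 + 90 + 5 = 105
minutes.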



Table 1 lists the sequences of the degenerate primers which should be
used for amplifying dTDP-glucose dehydratases from different
streptomycetes.

Table 1: Primer sequences for amplifying dTDP-glucose
4,6-dehydratases

Primer 1: CSGGSGSSGCSGGSTTCATSGG (SEQ ID NO.: 1)
Primer 2: GGGWVCTGGYVSGGSCCGTAGTTG (SEQ ID NO.: 2)
In this table, S=G or C, W=A or T, V=A or G, and Y=T or C.
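
To illustrate what the degenerate positions in Table 1 imply, the following
sketch counts and enumerates the individual oligonucleotides represented by
each primer. It uses the code definitions exactly as given under Table 1
(in the standard IUPAC nomenclature V usually denotes A, C or G; the sketch
follows the definition in the text). The function names are illustrative
assumptions.

```python
from itertools import product

# Degeneracy codes as defined under Table 1 of the text.
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "S": "GC",  # S = G or C
    "W": "AT",  # W = A or T
    "V": "AG",  # V = A or G (definition used in the text)
    "Y": "TC",  # Y = T or C
}

PRIMER_1 = "CSGGSGSSGCSGGSTTCATSGG"    # SEQ ID NO.: 1
PRIMER_2 = "GGGWVCTGGYVSGGSCCGTAGTTG"  # SEQ ID NO.: 2


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(CODES[base])
    return count


def expand(primer):
    """Yield every non-degenerate sequence represented by the primer."""
    for combo in product(*(CODES[base] for base in primer)):
        yield "".join(combo)


if __name__ == "__main__":
    for name, primer in (("Primer 1", PRIMER_1), ("Primer 2", PRIMER_2)):
        print(name, len(primer), "nt,", degeneracy(primer), "variants")
```

Under the definitions in the text, primer 1 (22 nucleotides, seven S
positions) represents 128 sequences and primer 2 (24 nucleotides, six
degenerate positions) represents 64 sequences.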


Example 2: DNA sequences of the PCR fragments isolated from
Streptomyces glaucescens GLA.O

The sequencing was performed by the dideoxy chain termination method
of Sanger et al. (PNAS USA, 74: 5463-5467 (1977)). The reactions were
carried out using the Auto Read Sequencing Kit from Pharmacia Biotech
(Freiburg, Germany) in accordance with the manufacturer's instructions. An
ALF® DNA Sequencer from Pharmacia Biotech (Freiburg, Germany) was
used for separation and detection.

The subsequent cloning of the PCR fragments (Sure Clone Kit®,
Pharmacia Biotech, Freiburg) into the E. coli vector pUC18, and the
sequencing of the fragment, provided support for the supposition that the
fragment encoded a dTDP-glucose 4,6-dehydratase. However, 2 different
genes were isolated which both exhibit high degrees of homology with
dTDP-glucose 4,6-dehydratase but are not identical. In what follows,
the PCR fragments are designated acbD and HstrE.

The sequences of the isolated PCR fragments are shown in Table 2A and
2B and the homology comparison of the deduced amino acid sequences of
HstrE and acbD is shown in Table 2C. The two proteins exhibit an identity
of only 65%.

Table 2A: DNA sequence of acbD (primer-binding sites are underlined,
SEQ ID NO.: 3)

Primer 1

1 CCCGGGCGGG GgGGGGTTgA TgGG CTggfig CTACGTCCGC CGGCTCCTCT
51 CGCCCGGGGC CCCCGGCGGC GTCGCGGTGA CCGTCCTCGA CAAACTCACC
101 TACGCCGGCA GCCTCGCCCG CCTGCACGCG GTGCGTGACC ATCCCGGCCT
151 CACCTTCGTC CAGGGCGACG TGTGCGACAC CGCGCTCGTC GACACGCTGG
201 CCGCGCGGCA CGACGACATC GTGCACTTCG CGGCCGAGTC GCACGTCGAC
251 CGCTCCATCA CCGACAGCGG TGCCTTCACC CGCXCCAACG TGCTGGGCAC
301 CCAGGTCCTG CTCGACGCCG CGCTCCGCCA CGGTGTGCGC ACCCTCGTGC
351 ACGTCTCCAC CGACGAGGTG TACGGCTCCC TCCCGCACGG GGCCGCCGCG
401 GAGAG CGACC CCCTGCTCCC GACCTCGCCG TACGCGGCGT CGAAGGCGGC
451 CTCGGACCTC ATGGCGCTCG CCCACCACCG CACCCACGGC CTGGACGTCC
501 GGGTGACCCG CTGTTCGA AC AACTftCGGCC CGCACCAGTT CgC GGG

Primer 2

Table 2B: DNA sequence of HstrE (primer-binding sites are underlined, 
SEQ ID NO.: 4) 

Primer 2 

1 CCCC CGGTGC TGGTAGGGGC eGTAGTTG TT GGAGCAGCGG GTGATGCGCA 

51 CGTCCAGGCC GTGGCTGACG TGCATGGCCA GCGCGAGCAG GTCGCCCGAC 

101 GCCTTGGAGG TGGCATAGGG GCTGTTGGGG CGCAGCGGCT CGTCCTCCGT 

151 CCACGACCCC GTCTCCAGCG AGCCGTAGAC CTCGTCGGTG GACACCTGCA 

201 CGAAGGGGGC CACGCCGTGC CGCAGGGCCG CGTCGAGGAG TGTCTGCGTG 

251 CCGCCGGCGT TGGTCCGCAC GAACGCGGCG GCATCGAGCA GCGAGCGGTC 

301 CACGTGCGAC TCGGCGGCGA GGTGCACGAC CTGGTCCTGG CCGGCCATGA 

351 CCCGGTCGAC CAGGTCCGCG TCGCAGATGT CGCCGTGGAC GAAGCGCAGC 

401 CGGGGGTGGT CGCGGACCGG GTCGAGGTTG GCGAGGTTGC CGGCGTAGCT 

451 CAGGGCGTCG AG CACGGTG A CGACGGCGTC GGGCGGCCCG TCCGGACCGA 
501 GGAGGGTGCG GACGTAGTGC KAfieeCATGA ACCCCGCCGC C 

Primer 1 

Table 2C: Homology comparison of the deduced amino acid sequences 
of the PCR products HstrE and acbD (program: GAP) 

Quality: 196.3 Length: 182 

Ratio: 1.091 Gaps: 0 

Percent similarity: 77.654 Percent identity: 65.363 

PCRstrE.Pep x PCRacbD.Pep 

1 . . AAGFWGSHYVRTLL^PDGPPDAVVTVI^ 48 

Hi hi I Ml lh|:sh:..|llll hilMht-IIIM I M 

1 PGGAGFIGSAYVKUXSPGAPGGVAVTVIJ5KLTYAGSIARLHAVRDHPGL 50 

• • • 

49 RFVHGDICDADLVDRVMAGQDQVVHLAAESHVDRSLLDAAAFVRTNAGGT 98 

11*11^11 — 111 s I ihHhllllllllh I - • I I • I I I - M „ 
51 TFVQGDVCDTALVDTLAAWiDDIVHFAAESHVDRSITDSGAFTRTNVLGT 100 

99 QTLLDAALRHGVAPFVQVSTDEVYGSLETGSWTEDEPLRPNSPYATSKAS 148 

l-IIIIIIIIII -shlllllllllh h •h^Ii-li!!:Uh 

101 QVIXDAAIJmGVRTLVHVSTDEVYGSLPHGAAAESDPIXPTSPYAASKAA 150 

• • • 

149 GDLLALAMHVSHGIiDVRITRCSNNYGPYQHPG 180 

s I h 1 1 1 I -lllllhllllllllhl I 

151 SDLMALAHHRTHGLDVRVTRCSNNYGPHQFP . 181 

in each case, upper row: SEQ ID NO.: 5 
in each case, lower row: SEQ ID NO.: 6 
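
The percentage figures reported by the GAP program in Table 2C can be
understood as simple ratios over the compared alignment positions. The
sketch below shows one common way of computing percent identity from two
aligned rows; it is not the GAP implementation, and the two short rows used
are hypothetical toy strings, not verified sequence data. As an arithmetic
observation that is not stated in the text, the reported values are
consistent with 117 identical and 139 similar positions out of 179 compared
positions.

```python
def percent_identity(row_a, row_b):
    """Percent of compared positions (no gap in either row) that are identical."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    # Skip positions where either row carries a gap symbol ('.' or '-').
    compared = [(a, b) for a, b in zip(row_a, row_b)
                if a not in ".-" and b not in ".-"]
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)


# Hypothetical toy rows for illustration only.
TOY_UPPER = "HGDICDADLV"
TOY_LOWER = "QGDVCDTALV"

if __name__ == "__main__":
    print(percent_identity(TOY_UPPER, TOY_LOWER))
    # Consistency check of the figures reported in Table 2C (inferred counts).
    print(round(100 * 117 / 179, 3), round(100 * 139 / 179, 3))
```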
Example 3: Southern analysis using chromosomal DNA from
Streptomyces glaucescens GLA.O and the isolated and
labeled PCR fragments

The cells were grown in R2YENG medium and harvested for the DNA
isolation after 30 h. The chromosomal DNA was isolated from
S. glaucescens GLA.O as described in Hopwood et al. ((1985) Genetic
manipulations of Streptomyces: a laboratory manual. The John Innes
Foundation, Norwich UK).

A Southern blot analysis was carried out using the S. glaucescens GLA.O
producer strain chromosomal DNA, which was digested with PstI, BglII and
BamHI, using the labeled probes consisting of the acbD and HstrE PCR
fragments. The two PCR fragments were labeled with digoxigenin in
accordance with the manufacturer's (Boehringer Mannheim; Mannheim)
instructions, and a digest of the Streptomyces glaucescens GLA.O
producer strain chromosomal DNA was separated on an agarose gel. The
DNA was transferred by capillary transfer to nylon membranes and DNA
regions which were homologous with the labeled probes were
subsequently visualized following hybridization.

The two genes label different DNA regions (Fig. 1 and Fig. 2). The
fragments which were labeled by HstrE must be gene fragments from
Streptomyces glaucescens GLA.O hydroxystreptomycin biosynthesis.
While the DNA sequence is not published, the high degree of homology of
the protein sequence deduced from HstrE with StrE (Pissowotzki et al.
(1991) Mol. Gen. Genet. 231: 113-123) from Streptomyces griseus N2-3-11
streptomycin biosynthesis (82% identity) and the concordance of the
HstrE-labeled DNA fragments (Fig. 1) with the published restriction map of
the Streptomyces glaucescens GLA.O hydroxystreptomycin gene cluster
(Retzlaff et al. (1993) Industrial Microorganisms. Basic and applied
molecular genetics ASM, Washington DC, USA) permit this conclusion.
The fragments which were labeled by the acbD probe (Fig. 2) belong to a
DNA region which has not previously been investigated. This region
encodes the enzymes for biosynthesizing the Streptomyces glaucescens
GLA.O pseudo-oligosaccharides.
Example 4: Cloning the 6.8 kb PstI fragment

Inter alia, the acbD PCR fragment labels a 6.8 kb PstI DNA fragment
(Fig. 2). This DNA fragment was isolated as follows. The region of the gel
was excised with a razor blade and the DNA was isolated from the gel
using an isolation kit from Pharmacia Biotech and cloned into plasmid
pUC19 which had been cut with the restriction enzyme PstI (plasmid
pacb1); this latter plasmid was then transformed into the E. coli strain
DH5α. The individual clones were subcultured from these plates and a
plasmid DNA isolation was carried out using these clones. A PCR
amplification using the above-described primers 1 and 2 (Tab. 1) was
carried out using the DNA from these clones (250). In this manner, the
appropriate E. coli clone containing the 6.8 kb PstI fragment was isolated.

Example 5: Sequencing the isolated 6.8 kb PstI DNA fragment

The DNA was digested with various restriction enzymes and individual
DNA fragments were cloned into pUC19. The DNA sequence of the entire
fragment, which is shown in Tab. 4 (SEQ ID NO.: 7), was then determined.
The DNA sequence of the 6.8 kb PstI fragment was only partially
confirmed by supplementary sequencing of the opposing strand. Several
open reading frames, encoding various proteins, were found (programs:
CODONPREFERENCE and BlastX). A total of 6 coding regions were found,
i.e. a gene having a high degree of homology with ATP-binding proteins,
acbA, an aminotransferase, acbB, a dTDP-glucose synthase, acbC, a
dTDP-glucose dehydratase, acbD, a regulatory gene having homologies
with the LacI protein family, acbE, and a protein having similarities to
sugar-binding proteins, acbF. The sequences of the acbA and acbF genes
were only determined in part. The homologies with other proteins from the
databases, and the properties of the putative proteins, are summarized in
Tab. 3. Fig. 3 shows, in summary form, a restriction map of the fragment,
containing the most important restriction cleavage sites mentioned in the
text, and the arrangement of the identified open reading frames.
Table 3: Analysis of the identified open reading frames on the 6.8 kb
PstI fragment from Streptomyces glaucescens GLA.O

ORF    Amino acids   MW      FastA                        % Identity   Accession number (a)
acbA   239 *                 MalK, E. coli                29%          P02914
acbB   429           45618   DgdA, Burkholderia cepacia   32%          P16932
acbC   355           37552   StrD, Streptomyces griseus   60%          P08075
acbD   325           35341   StrE, Streptomyces griseus   62%          P29782
acbE   345           36549   DegA, Bacillus subtilis      31%          P37947
acbF   396 *                 MalE, E. coli                22%          P02928

*: incomplete open reading frame; (a): Swiss-Prot database (release 32)

Example 6: Deletion of genes acbBCD for pseudo-oligosaccharide
biosynthesis from the Streptomyces glaucescens GLA.O
chromosome

Evidence that the identified DNA fragment encoded pseudo-oligosaccharide
biosynthesis genes was produced as follows. A 3.4 kb
gene region (EcoRI/SstI fragment b, Fig. 3) was replaced with the
erythromycin resistance gene (1.6 kb) and cloned, together with flanking
DNA regions from the 6.8 kb PstI fragment (pacb1), into the
temperature-sensitive plasmid pGM160. The plasmid was constructed as
follows: the 2.2 kb EcoRI/HindIII fragment (c, Fig. 3) from plasmid
pacb1 was cloned into pGEM7zf (Promega, Madison, WI, USA; plasmid
pacb2), and the 1 kb SstI fragment from pacb1 (a, Fig. 3) was cloned into
pUC19 (plasmid pacb3). A ligation was then carried out using the following
fragments. The plasmid pGM160 (Muth et al. (1989) Mol. Gen. Genet.
219:341-348) was cut with BamHI/HindIII, the plasmid pacb2 was cut with
XbaI/BamHI (c, Fig. 3), the plasmid pacb3 was cut with EcoRI/HindIII (a,
Fig. 3), and the plasmid pIJ4026 (Bibb et al. (1985) Gene 38:215-226) was
cut with EcoRI/XbaI in order to isolate the 1.6 kb ermE resistance gene.
The fragments were ligated in a mixture and transformed into E. coli DH5α
and selected on ampicillin. The resulting plasmid, i.e. pacb4, was isolated
from E. coli DH5α, tested for its correctness by means of restriction
digestion and then transferred by protoplast transformation into
S. glaucescens GLA.O. The transformants were selected with thiostrepton
at 27°C in R2YENG agar. The transformants were subsequently incubated
at the non-permissive temperature of 39°C, thereby bringing about
integration of the plasmid into the genome by way of homologous
recombination (selection with thiostrepton (25 µg/ml) and erythromycin
(50 µg/ml)). Under these conditions, the only clones which can grow are
those in which the plasmid has become integrated into the genome. The
corresponding clones were isolated, caused to sporulate (medium 1, see
below) and plated out on erythromycin-containing agar (medium 1).
Individual clones were isolated once again from this plate and streaked
out on both thiostrepton-containing medium and erythromycin-containing
medium. The clones which were erythromycin-resistant but no longer
thiostrepton-resistant were analyzed. In these clones, the acbBCD genes
had been replaced with ermE. Several clones were examined and the strain
S. glaucescens GLA.O Δacb was finally selected as the reference strain
(erythromycin-resistant, thiostrepton-sensitive) for further investigation.

Medium 1

Yeast extract    4 g/L
Malt extract     10 g/L
Glucose          4 g/L
Agar             15 g/L
pH               7.2

A further experiment examined whether the corresponding strain still
produced acarbose. Some clones were grown and investigated for
formation of the α-amylase inhibitor in a bioassay; however, no activity was
found. The mutants were subsequently further characterized by means of
Southern hybridization. Integration of the ermE gene had taken place at
the predicted site. Fig. 4 shows a Southern hybridization which was carried
out with the wild type and with the Streptomyces glaucescens GLA.O Δacb
deletion mutant. The SstI fragment from pacb3 was used as the probe.
The chromosomal DNA was isolated from the wild type and mutant and
digested with the enzymes PstI and PstI/HindIII. The fragment pattern
obtained for the deletion mutant corresponds to the predicted
recombination event. The wild type exhibits the unchanged 6.8 kb PstI
fragment, whereas the mutant exhibits a fragment which has been
truncated by 1.8 kb (compare lanes 1 and 3, Fig. 4). Integration of the
ermE resistance gene additionally introduced an internal HindIII cleavage
site into the PstI fragment (compare lanes 2 and 4, Fig. 4).
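
The 1.8 kb truncation follows directly from the fragment sizes given in
Example 6 (a 3.4 kb region replaced by the 1.6 kb ermE gene within the
6.8 kb PstI fragment). The following sketch merely reproduces this
arithmetic; the function name is an illustrative assumption.

```python
def expected_mutant_fragment_kb(wild_type_kb, deleted_kb, inserted_kb):
    """Expected size of a restriction fragment after a replacement event."""
    return round(wild_type_kb - deleted_kb + inserted_kb, 3)


# Sizes taken from the text: 6.8 kb fragment, 3.4 kb region removed,
# 1.6 kb ermE resistance gene inserted.
MUTANT_KB = expected_mutant_fragment_kb(6.8, 3.4, 1.6)
TRUNCATION_KB = round(6.8 - MUTANT_KB, 3)

if __name__ == "__main__":
    print(f"Expected mutant fragment: {MUTANT_KB} kb")
    print(f"Truncation relative to wild type: {TRUNCATION_KB} kb")
```

On these figures the mutant fragment is expected at about 5.0 kb, i.e.
1.8 kb shorter than the wild-type fragment.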

Example 7: Inhibition of α-amylase by acarbose

Using an enzymic test for detecting starch (TC-Starch, Boehringer-
Mannheim, Cat. No. 297748), it was possible to demonstrate that the
compound isolated from Streptomyces glaucescens GLA.O inhibits
α-amylase. Test principle: starch is cleaved into D-glucose by
amyloglucosidase. The glucose is then converted with hexokinase into
glucose-6-phosphate and the latter is converted with glucose-6-phosphate
dehydrogenase into D-gluconate-6-phosphate. This reaction produces
NADPH, whose formation can be determined photometrically. Acarbose
inhibits the α-amylase and thereby prevents the formation of D-glucose
and ultimately the formation of NADPH as well.

Example 8: Medium for growing S. glaucescens GLA.O and producing
acarbose

The fermentation was carried out, at 27°C on an orbital shaker at 120 rpm,
in 500 ml Erlenmeyer flasks which were fitted with side baffles and which
contained 100 ml of medium 2. The fermentation was terminated after 2 or
3 days. The pseudo-oligosaccharides were detected in a plate diffusion
test as described in Example 9. No α-amylase inhibitors were produced
when medium 3 was used. This indicates that the production of the
pseudo-oligosaccharides is inhibited by glucose. Other sugars, such as
maltose and sucrose, or complex sugar sources (malt extract) can also
come into consideration for producing pseudo-oligosaccharides using
S. glaucescens GLA.O.

Medium 2:

Soybean flour    20 g/L
Starch           20 g/L
pH               7.2

Medium 3:

Soybean flour    20 g/L
Glucose          20 g/L
pH               7.2

Example 9: Biotest using Mucor miehei

A suspension of spores of the strain Mucor miehei was poured into agar
(medium 5) (10^5 spores/ml), and 10 ml of this mixture were in each case
poured into Petri dishes. Paper test disks (6 mm diameter) were loaded
with 10 µl of acarbose [lacuna] (1 mg/ml) or with a sample from an S.
glaucescens culture and laid on the test plates. The plates were then
incubated at 37°C. Inhibition halos appeared on the starch-containing
medium 5. A plate which was prepared with glucose (medium 4) instead of
starch was used as a control. On this medium, no inhibition halo formed
around the filter disks loaded with active compound.

Media 4 and 5:

KH2PO4 x 3 H2O               0.5 g
MgSO4 x 7 H2O                0.2 g
NaCl                         0.1 g
Ammonium sulfate             5 g
Yeast nitrogen base          1.7 g
Glucose (4) or starch (5)    5 g
Agar                         15 g

Example 10: Transformation of S. glaucescens GLA.O

Protoplasts of the Streptomyces glaucescens strain were isolated as
described in Motamedi and Hutchinson ((1987) PNAS USA 84: 4445-
4449), and the isolated plasmid DNA was transferred into the cells by
means of PEG transformation as explained in Hopwood et al. ((1985)
Genetic manipulations of Streptomyces: a laboratory manual. The John
Innes Foundation, Norwich UK). The protoplasts were regenerated on
R2YENG medium at 30°C (Motamedi and Hutchinson (1987) PNAS USA
84: 4445-4449). After 18 h, the agar plates were overlaid with a
thiostrepton-containing solution and incubated at 30°C (final concentration
of thiostrepton: 20 µg/ml).

Example 11: Isolation of the pseudo-oligosaccharides from Streptomyces
glaucescens GLA.O, HPLC analysis and mass spectroscopy

Isolation

The culture broth was separated from the mycelium by filtration. The
culture filtrate obtained in this way was then loaded onto an
XAD16 column, after which the column was washed with water and the
active components were eluted with 30% methanol. The eluate was
concentrated down to the aqueous phase and the latter was extracted with
ethyl acetate in order to remove lipophilic impurities. The aqueous phase
was then concentrated and the active components were further purified in
5% methanol using a biogel P2 column. The individual fractions were
collected in a fraction collector and analyzed by means of
the Mucor miehei biotest. Active eluates were rechromatographed, for
further purification, in 5% methanol on biogel P2. The material which was
isolated in this way was investigated by HPLC and MS.

HPLC

Column:            Nucleosil® 100 C-18
Eluent:            0.1% phosphoric acid = A / acetonitrile = B
Gradient:          from 0 to 100% B in 15 min
Detection:         215 nm
Flow:              2 ml/min
Injection volume:  10-20 µl

Using HPLC, it was not possible to distinguish the pseudo-oligosaccharide
preparation from S. glaucescens GLA.O from authentic acarbose. Both the
retention time and the UV absorption spectrum of the two components
were identical in this eluent system. The pseudo-oligosaccharide mixture
was not fractionated under these conditions.

Mass spectroscopic analysis (MS)

The molecular weights and the fragmentation pattern of authentic acarbose
and the pseudo-oligosaccharides isolated from Streptomyces glaucescens
GLA.O were determined by means of electrospray MS. Analysis of the
acarbose which is commercially obtainable from Bayer (Glucobay) gave a
mass peak at 645.5 (acarbose). The purified samples from S. glaucescens
GLA.O contain a mixture of different pseudo-oligosaccharides whose sugar
side chains are of different lengths: 969 (acarbose + 2 glucose units), 807
(acarbose + 1 glucose unit), 645 (corresponds to authentic acarbose).
When acarbose and the compound which is isolated from S. glaucescens
GLA.O and which has a molecular weight of 645 are fragmented, the same
molecular fragments are formed, i.e.: 145 (4-amino-4,6-deoxyglucose), 303
(Acarviosin) and 465 (303 together with one glucose unit).
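
The mass values reported above are spaced by a constant increment, which
the following sketch makes explicit. It assumes a nominal mass of 162 per
added glucose unit (an anhydroglucose residue, C6H10O5); this increment is
an assumption of the sketch, chosen because it reproduces the reported
peaks, and the function name is illustrative.

```python
# Nominal mass added per glucose unit (anhydroglucose residue, C6H10O5).
# Assumption of this sketch: 6*12 + 10*1 + 5*16 = 162.
GLUCOSE_UNIT = 6 * 12 + 10 * 1 + 5 * 16

ACARBOSE_PEAK = 645     # mass peak reported in the text for acarbose
ACARVIOSIN_PEAK = 303   # fragment reported in the text


def with_glucose_units(base_mass, units):
    """Nominal mass of a species carrying `units` additional glucose units."""
    return base_mass + units * GLUCOSE_UNIT


if __name__ == "__main__":
    for n in range(3):
        print(f"acarbose + {n} glucose: {with_glucose_units(ACARBOSE_PEAK, n)}")
    print(f"acarviosin + 1 glucose: {with_glucose_units(ACARVIOSIN_PEAK, 1)}")
```

Under this assumption the series 645, 807 and 969 and the fragment at 465
are reproduced.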

Actinoplanes sp. SE50 also produces a mixture of acarbose molecules
having sugar side chains of different length (Truscheit (1984) VIIIth
International Symposium on Medicinal Chemistry, Proc. Vol 1. Swedish
Academy of Pharmaceutical Sciences, Stockholm, Sweden). The length of
the sugar side chains can be influenced by the choice of the fermentation
parameters and of the substrate in the nutrient solution.

Example 12: Southern hybridization using Actinoplanes sp. SE50/110
(ATCC31044)

The chromosomal DNA was isolated from the strain Actinoplanes sp.
SE50/100 and digested with restriction enzymes (PstI and BamHI). A
Southern hybridization was then carried out using a probe which
encompasses the coding region of the dTDP-glucose 4,6-dehydratase
acbD from Streptomyces glaucescens GLA.O (fragment d, Fig. 3). The
probe hybridizes with distinct bands from Actinoplanes sp. SE50/110
(Fig. 5, lanes 1 and 2). This provides the possibility of isolating the
corresponding fragments from Actinoplanes sp. SE50/100 and other strain
lines. Whether these DNA regions are in fact involved in the biosynthesis
of acarbose remains to be demonstrated in subsequent investigations.
Alternatively, the PCR primers 1 and 2 (Tab. 1) could also be used for
amplifying the dTDP-glucose 4,6-dehydratase from Actinoplanes sp.
Legends:

Fig. 1: Southern hybridization using S. glaucescens GLA.O. Lane 1:
        PstI, lane 2: BamHI, lane 3: BglII. The labeled PCR fragment
        HstrE was used as the probe. Labeling of DNA fragments
        which are involved in the biosynthesis of hydroxystreptomycin.

Fig. 2: Southern hybridization using S. glaucescens GLA.O. Lane 1:
        PstI, lane 2: BamHI, lane 3: BglII. The labeled PCR fragment
        acbD was used as the probe. Labeling of DNA fragments
        which are involved in the biosynthesis of the
        pseudo-oligosaccharides.

Fig. 3: Restriction map of the 6.8 kb PstI fragment from
        Streptomyces glaucescens GLA.O. Open reading frames
        and the direction in which each is transcribed are indicated by
        arrows. The fragments a, b, c and d identify DNA regions
        which are explained in more detail in the text.

Fig. 4: Southern hybridization using Streptomyces glaucescens
        Δacb: lane 1: PstI, lane 2: PstI/HindIII, and Streptomyces
        glaucescens GLA.O: lane 3: PstI, lane 4: PstI/HindIII. The
        labeled 1.0 kb SstI fragment a (Fig. 3) was used as the
        probe.

Fig. 5: Southern hybridization using Actinoplanes sp. SE50/100:
        lane 1: PstI, lane 2: BamHI and Streptomyces glaucescens
        GLA.O: lane 3: PstI. The labeled 1.0 kb SmaI/EcoRI fragment
        d (dTDP-glucose 4,6-dehydratase, Fig. 3) was used as the
        probe. The arrows indicate the labeled DNA fragments
        (BamHI: 2.1 and 0.7 kb, PstI: ~11-12 kb)



Tab. 4: DNA sequence of the 6.8 kb PstI fragment from
        Streptomyces glaucescens GLA.O (SEQ ID NO.: 7). The
        deduced amino acid sequences (SEQ ID NO.: 8-13) of the
        identified open reading frames are given under the DNA
        sequences. Start and stop codons and potential ribosome
        binding sites are underlined.
        acbA: SEQ ID NO.: 8
        acbB: SEQ ID NO.: 9
        acbC: SEQ ID NO.: 10
        acbD: SEQ ID NO.: 11
        acbE: SEQ ID NO.: 12
        acbF: SEQ ID NO.: 13




WO 97/47748                                        PCT/EP97/02826



Table 4: (SEQ ID NO.: 7, 8, 9, 10, 11, 12, 13) 



PstI

CTCCAGGGTTCCCTGGTGCACGACCCGCCCCTGGTCGACGACCAGGGCGCTGTCGGAGAT 

-H 1 , „ ., — ■-. — ».» —— .~~,~-»+~».»--~~-+ 60 

GACGTCCCAAGGGACXACGTGCTGGCCGGGGACCAGCTCCTGGT^ 

QI*TGQHVVRGQDVVI*ASDCI — 

CGCGGCGATGTCGCCCATGTCGTGGCTGGTGAGCACCACGGTGGTGCCCAGTTCCCGGTG 

k— + + + 120 

GCGCCGCTACAGCCGCTACAGGACCGACCACTCGTGGTGCCACCACGGGTGAAGGGCCAC 
AAIDAIDHSTLVVTTCLERH- 

GGCGCGGTTGACCAGCCGGCGCACCGCGTCCTTCAGCACCATGTCGACGCCGATCCTGGG 

+ +- H «h— + 180 

CCGCGCCAACTGGTCGGCCGCGTGGCGCAGGAAGTCGTGGTACAGCTCCGGCTAGCACCC 
ARNVLRRVAD KLVMDLG ITP- 

CTCGTCCCAGAACAGCACGGCCGGGTCGTCCACCACCCTCCCCGCXSATCTCGGCGCGCAT 

+ — „+ + + + 240 

GAGCACGCTCTTGTCGTGCCGGCCCAGCACGTCGTCCGAGCGGCGCTAGAGCCGCGCGXA 
EDWFLVAPDHLLSAAI E A R M - 

SphI

GCGCTGTCCGAGGCTGAGCTGCCGCAOGGGGGTGGACCCCAGCGCGTCGATGTCGAGGAG 

— + 4— 4- + 300 

CGCGACAGGCTCCGACTCGACGGCCTGCCCCCACCrGGCCTCGCCCAGCTACACCTCCTC 
RQCLSLQRVPTSGliAD IDX.I*- 

GTCCCGGAACAGGGCGAGGTTGCGCCGGTAGACCGGTCCGGGG ATGTCGTAGATCCCGCG ^ ^ 

CAGGGCCTTGTCCCGCTCCAACGCGGCCATCTGGCCAGGCCCCTACAGCATCTACGCCGC 
DRFLALNRRYVPGPIDYIRR- 



KpnI

CAGGATGCCGAAGGAGTCGGGTACCGACAGGTCCCACCAGAGCTGGCTGCGCTGGCCGAA 

+ + + + + + 420 

GTCCTACGCCTTCCTCAGCCCATGGCTGTCCAGGGTGGTCTCGACCGACGCGACCGGCTT 
LIRFSDPVSLDWWLQSRQGF- 

CACGACCCCGATCGTGCGGGCGTTGCGCTGCCGGTGCCGGTAGCGCTCCAGCCCGGCGAC 

+ + 4> 4. 4- + 480 

CTGCTGCGGCTAGCACGCCCGCAACGCGACCGCCACGGCCATCCCGAGGTCGGGCCCCTG 
VVG I TRANRQRHRY PE LGAV- 

CGTGCAGCGGCCGGAGGTGGGGGTCATGATGCCGGTCAGCATCTTGATCGTGGTCGACTT 

+ + + + + + 540 

GCACGTCGCCGGCCTCCACCCCCAGTACTACGGCCAGTCGTaGAACTAGCACCAGCTGAA 
TCRGSTPTMIGTLMKI T T S K - 

GCCGGCTCCGTTGGCGCCGATGTAGGCGGTCTTCGTGCCGGCCGGTATCTCGAAGGAGAC 

+ — + 4- + + + 600 

CGGCCGAGGCAACCGCGGCTACATCCGCCAGAAGCACGGCCGGCCATAGAGCTTCCTCTG 
GAGNAGI YATKTCAPI EF5V- 
KpnI

GTCGTCGACGGCGCGCACGACGCGGTACCGGCGGGTCAGCAGGGTGCAGAGGCTGCCGAG 

4 — — 4- — — — — — -+ + 660 

DDVARVVRYRRTLLTSLSCI*- 

C»GGCCGGGCTCGCCTTCCGCCAGCCGGAACTCCTTGACCAGGTCTTCCCCCACCATCAC 

_H H—. — ■-- ~i -+ 720 

GTCCGGCCCCAGCGCAAGCCGGTCGGCCTTCAGGAACrrc 

* — 

L G PERSAX.RFSKVLHEAVXV- 

— — acbA 

GCGATCACCCGCTCGACGGCCGTCTCCAGCAGGCGCAGGCCCTCGTCGAGCAGCGCCTCG 

CGCTAGTGGGCGAGCTGCCGGCAGAGGTCGTCCGCGTCCGGGAGCAGCTCGTCGCGGAGC 
AIVREVATELI-RI-GEDLLAE- 

TCGAGGGTGAACGGCGGTGCCAGCCGCAGGATGTGGCCGCCCAGGGAGGTGCGCAGCCCC 

+ + + 4 + + 840 

AGCTCCCACTTGCCGCCACGGTCGGCGTCCTACACCGGCGGGTCCCTCCACGCGTCGGGG 
DLTFPPALRLIHGGLSTRLG- 

SmaI

AGGTCGAGGGCGCTGGTGTAGACGGCCCGGGCGGTCTCGGGGGCGGGTGCCCGGCCSACG 

+ + + + + + 900 

TCCAGCTCCCGCCACCACATCTGCCGGGCCCGCCAGAGCCCCCGCCCACGGGCCGGCTGC 
LDLATTYVARATE PAPARGV- 

GCGTCGGTGACGAACTCCAGGCCCCACAGCAGTCCGAGGCCGCGTACCTGGCCGAGCTGG 

+ + + + + + 960 

CGCAGCCACTGCTTGAGGTCCGGGGTGTCGTCAGGCTCCGGCGCATGGACCGGCTCGACC 
ADTVFELGW.LLGLGRVQGLQ- 

SstI

GGGAAGCGGGACTCCAGGGCGCGCAGCCGCTCCTGGATGAGCTCGCCGAGGACGCGCACG 

+ + + + + + 1020 

CCCTTCGCCCTGAGGTCCCGCGCGTCGGCGAGGACCTACTCGAGCGGCrCCTGCGCGTGC 
PFRSELARLREQILEGLVRV- 

CGGTCGATCAGCCGGTCGCGCTCGACGACCTCCAGCGTGGCGCGGGCGGCGGCGATCCCC 

+ + + + + + 1080 

GCCAGCTAGTCGGCCAGCGCGAGCTGCTGGAGGTCGCACCGCGCCCGCCGCCGCTAGGGG 
EDI LRDREVVELTARAAAIG- 

SmaI

AGTGGGTTGCTCGCGTACGTCGAGGCGTACGCCCCGGGGTGGCCGCCTCCGGCCTGCGCA 

+ + + + + 1140 

TCACCCAACGAGCGCATGCAGCTCCGCATGCGGGGCCCCACCGGCGGAGGCCGGACGCGT 
LPNSAYTSAYAGPHGGGAQA- 






GCTTCCGCGCGTCCGGCCAGCACGGCGAAGGGGAAXCCGCTCGCGGTGCCCTTGGACAGC 
H 4 4 h + + 1200 

CGJVkGGCGCGCAGGCCGGTCGTGCCGCTTCCCCtt 
AEARGALVAFPFGSATGKS L- 

ATCGCCAGGTCCGGCTCGATGCCGAACAGTTCGCTGGCGAGGAAGGCGCCGGTGCGCCCG 

^ ^ 4\ + + — — ♦ 1260 

TAGCGGTCCAGGCCGAGCTACGGCTTGTCAAGCGACCGCTCCTTCCGCGGCCACGCGGGC 
MALDPEIGFI#ESALFAGTRG- 

CCGCCGGTGAGGACCTCGTCGGCGACGAGCAGCACGCCGCCGTCCCGGCAGGCGCCGGCG 

+ H + + + + 1320 

GGCGGCCACTCCTGGAGCAGCCGCTGCTCGTCGTGCGGCGGCAGGGCCGTCCGCGGCCGC 
GGTLVE DAVLLVGGDR CAGA- 

ATCCGCTCCCAGTAGCCGGGGGGCGGCACGATGACGCCTGCCGCGCCGAGGACGGGTTCG 

+ + + + + + 1380 

TAGGCGAGGGTCATCGGCCCCCCGCCGTGCTACTGCGGACGGCGCGGCTCCTGCCCAAGC 
IREWYGPPPV IVGAAGLVPE- 

AAGACCAGGGCCGAGACGTTGGGCTTCTCCGCGATGTGCCGGCGCACGAGGGTCGCGCAC 

+ + + + + + 1440 

TTCTGGTCCCGGCTCTGCAACCCGAAGAGGCGCTACACGGCCGCGTGCTCCCAGCGCGTG 
FVLASVNPKEAI HRRVLTAC- 

CGCACGTCGCACGAGGGGTACTCCAGGCCCAGGGGACAGCGGTAGCCAGTAGGGGCTGTA 

+ + + + + + 1500 

GCGTGCAGCGTGCTCCCCATGAGGTCCGGGTCCCCTGTCGCCATCGGTCATCCCCGACAT 
RVDCSPYELGLPCRYGTPAT- 

GCCAGCACGCTGTTGCCGCTGAAGGCCTGGTGGCCGATGTCCCAGTGGACCAGCATCCGG 

4 + + + + + 1560 

CGGTCGTGCGACAACGGCGACTTCCGGACCACCGGCTACAGGGTCACCTGGTCGTAGGCC 
ALVSNGSFAgHGlDWHVLMR- 

GCGCCCATGGTCTTGCCGTGGAAGCCGTGGCGCAGGGCGCAGATCCGGTTGCGGCCCGGC 

+ + + + + + 1620 

CGCGGGTACCAGAACGGCACCTTCGGCACCGCGTCCCGCGTCTAGGCCAACGCCGGGCCG 
AGMTKGHFGHRLACI RNRGP- 

GCGGCGGTCGCCTGGACGACCCGCAGGGCGGCCTCGACCACCTCCGCGCCGGTGGAGAAG 

+ + + -f- + + 1680 

CGCCGCCAGCGGACCTGCTGGGCGTCCCGCCGGAGCTGGTGGAGGCGCGGCCACCTCTTC 
AATAQVVRLAAEVVEAGTS F - 

AAGGC GTAGGT GT C GAGCT GTT CGGGCAGCAG C CT GGCGAGCAGTT C CAGCAGG C C GGC G 

+ + + + + + 17 40 

TTCCGCATCCACAGCTCGACAAGCCCGTCGTCGGACCGCTCGTCAAGGTCGTCCGGCCGC 
FAYTDLQEPLLRALLELLGA- 



CGGTCCGGCGTGGCGCTGTCGTGGACGTTCCACAGGCGGCGGGCCTGGGTGGTGAGTGCC 

+ + + + + -v 1800 

GCCAGGCCGCACCGCGACAGCACCTGCAAGGTGTCCGCCGCCCGGACCCACCACTCACGG 
RDPTASDHVNWLRRAQTTLA- 

TCGACGACCTCCGGGTGCCCGTGGCCCAGTGACTGGGTGAGGGTCCCGGCCGCGAAGTCG 

+ + + + + + I860 

AGCTGCTGGAGGCCCACGGGCACCGGGTCACTGACCCACTCCCAGGGCCGGCGCTTCAGC 
EVVEPHGHGLSQTLTGAAFD- 



WO 97/47748 32 PCT/EP97/02826 



AGGTACTGGTTGCCGTCCAGGTCGGTCAGAACGGGACCGCGTCCCTCGGCGAAGACCCGG 

+ H + 4 + + 1920 

TCCATGACCAACGGCAGGTCCAGCCJUrrCTTTCCCTGGCGCAGGGAGCCGCTTCTGGGCC 
LYQNGDLDTLVPGRGEAFVR- 

CGTCCGTGGACGGCTTCCTCGGAGGCGCCCGGCGCCAGGTGGCGGGCCTCCCGTGCCAGG 

4— + h +- + + 1980 

GCAGGCACCTGCCGftAGGAGCCTCCGCGGGCCGCGGTCCACCGCCCGGAGGGCACGGTCC 
RGHVAEESAGPALHRAERAL- 



TGCTGTGTCTGCCGTAAGCCTGTCATCGCTGCCTCTGCTCGTCGGACCGGCTGACGCGAT
H 4 + h + + 2040 

ACGACACAGACGGCSITTCGGACACTAGCGA CGG 
HQTQRLGTM 

— acbB

CGCCGGGGAACTGCGTTGTGGCGCACGAGGGTTGGGGCGGCTCGGCGCTGAGTCAAAGAC 

GCGGCCGCTTGACGCAACACCSOGTGGTGCCAACCCCGCCGAGCCGCGACTCAGTTTGTG 

TTCAACACACACCGCTCCAAGAGTTTGGGGGTTGTTTCAGAAAGTTGTTGCGAGCCGCCC 

+ 4 h + + + 2160 

AACTTGTGTCTGGCGACGTTCTCAAACGCCCAAGAAAGTCTTTGAACAACGCTCGCCGGG 

CGGCACTCTGGTTGAGTCGACGTGCTTACGGCGCCACCACGCCTCACGTTCGAGGAGGGA 

— ~+ — ———*.— — -4- 4- — + 222 0 

GCCCTGAGACCAACTCACCTGCACGAATCCTOCGGTGCTGCGGACTGCAAGCTCCTCCCT 



CCTGTGAGAACAAGCCCGCAGACCGACCCGCTCCCGCGGAGGCCGAGGTGAAGGCCCTGG

GGACACTCTTGTTCGGGCGTCTGGC7GGGCGAGGGCGCCTCCGGCTCCACTTCCGGGACC 

V K A V V - 
acbC 



PvuII

TCCTGGCAGGTGGAACCGGCAGCAGACTGAGGCCGTTCACCCACACCGCCGCCAAGCAGC

AGGACCGTCCACCTTGCCCGTCGTCTCACTCCGGCAAGTGGGTGTGGCGGCGGTTCGTCC 
LAGCTCSRLRPFTHTAAKQL- 

TGCTCCCCATCGCCAACAAGCCCGTGCTCTTCTACGCGCTGGAGTCCCTCGCCGCGGCGG 

ACGAGGGGTAGCGGTTGTTCGGGCACGAGAAGATGCGCCACCTCAGGGAGCGGCGCCGCC 
LPIANKPVLFYALESLAAAG- 

GTGTCCGGGAGGCCGGCGTCGTCGTGGGCGCGTACGGCCGGGAGATCCGCGAACTCACCG 
+ 4-— 4— + + 2460 

cacaggccctco;gccgcagcagcacccx;cgcatcccggccctctaggcgcttgagtggc 

VREAGVVVGAYGREI RELTG- 

GCGACGGCACCGCGTTCGGGTTACGCATCACCTACCTCCACCAGCCCCGCCCGCTCGGTC 

CGCTGCCGTGGCGCAAGCCCAATGCGTAGTGGATGGAGGTGGTCGGGGCGGGCGAGCCAG 
DGTAFGLRITYLHQPRPLGL- 

TCGCGCACGCGGTGCGCATCGCCCGCGGCTTCCTGGGCGACGACGACTTCCTGCTGTACC 

4- 4- + 4- 4- + 2580 

AGCGCGTGCGCCACGCGTAGCGGGCGCCGAAGGACCCGCTGCTGCTGAAGGACGACATGG 
AHAVRIARG FLGDDD FLLY L - 
TGGGGGACAACTACCTGCCCCAGGGCGTCACCGACTTCGCCCGCCAATCGGCCGCCGATC

+ h k + t» + 2640 

ACCCCCTGTTGATGGACGGGGTCCCGCAGTGGCTGAAGCGGGCGGTTAGCCGGCGGCTAG 
GD NYLPQGVTD F A R Q S AAD 

CCGCGGCGGCCCGGCTGCTGCTCACCCCGGTCGCGGACCCGTCCGCCTTCGGCGTCGCGG

— ■ + ■ ■ - ■ | ■ + 4 + 2700 

CCCGCCCCCGGGCCGACGACGAGTGGGGCCAGCGCCTGGGCAGGCGGAAGCCGCAGCGCC 
AAARI*LLTPVADP SAFGVAZ — 

AC^TCGACGCCGACGGGAACGTGCTGCGCTTGGAGGAGAAACCCGACCTCCCGC^CAGCT 

+ — , k + 2760 

TCCAGCTCCGCCTGCCCTTGCACGACGCGAACXTCCTCTTT G GGCTGCAGGGCCCGTCCA 
VDADGNVLRLEEXPDVPRS S - 

CGCTCGCGCTCATCGGCGTGTACGCCTTGAGCCCGGCCCTCCACGAGGCGGTACGGGCCA 

1 + h -h 1 -+ 2820 

GCGAGCGCGAGTAGCCGCACATGCGGAAGTCGGGCCGGGAGGTGCTCCGCCATGCCCGGT 
LALIGVYAPSPAVHEAVRAI- 

tcaccccctccgcccgccgccagctggagatcacccacgccgtgcagtggatgatcgacc 

agtgggggaggcgggcgccgctccacctctagtgggtccggcacx;tcacctactagctc 
t p s a r g e l e i thavqwm i d r — 

ggggcctgcgcgtacgggccgagaccaccacccggccctggcgcgacaccggcagcgcgg 

+ + + + + + 2940 

ccccggacgcgcatgcccggctctggtggtgggccgggaccgcgctgtgcccgtcgcgcc 
clrvraetttrpwrotcsae- 

aggacatgctc^aggtcaaccgtcacgtcctggacggactggagggccgcatcgagggga 

tcctgtacgacctccagttggcagtgcaggacctgcctgacctcccggcgtagctcccct 
dmlevnrhvldglegriegk- 

aggtcgacgcgcacagcacgcxggtcggccgggtccgggtggccgaaggcgcgatcgxgc 

+ + + + — + + 3060 

tccagctgcgcgtgtcgtgcgaccagccggcccaggcccaccggcttccgcgctagcacg 
vdahsti*vgrvrvaega ivr- 

gggggtcacacgtggtgggccccgtcgtgatcggcgcxkxstgccgtcgtcagcaactcca 

► + + + + 3120 

CCCCCACTGTGCACCACCCGGGCCACCACTAGCCGCCCCCACGGCAGCAGTCGTTGAGGT 

gshvvgpvvigagavvsnss- 
gtgtcggcccgtacacctccatcgggcaggactgccgggtcgaggacagcgccatcgagt 

+ + + + + 3180 

cacagccgcccatctggaggtagcccctcctgacgccccagctcctgtcgcggtagctca 

VGPYTSIGEDCRVEDSAIEY- 

ACTCCGTCCTGCTGCGCGGCGCCCAGGTCGAGCGGGCGTCCCGCATCGAGGCGTCCCTCA 

+ + + +_„ — — — + + 3240 

TGAGCCACGACGACGCGCCCCCGGTCCAGCTCCCCCGCAGCGCGTAGCTCCGCAGGGAGT 
SVLLRGAQVECASRIEASLI- 

TCGGCCG CGGCGCCGTCGTCGGCCCGGCCCCCCGTCTCCCGCAGGCTCACCGACTGGTGA 

+ + + + + + 3300 

ACCCGGCCCCGCGGCAGCAGCCGGCCCGGGGGGCAGAGGGCGTCCGAGTGGCTGACCACT 
GRGAVVGPAPRLPQAHRLVI- 
TCGGCGACCACAGCAAGGTGTATCTCACCCCATGACCACGACCATCCTCGTCACCGGCGG
+ + + + + + 3360

AGCCGCXCGTGTCCITCCAC^TAGAGTGGGGTACTGGTGCTCGTAGGAGCAGTGGCCGCC 

MTTTILVTGG- 
GDHSKVYLTP* 

acbD —

SmaI

AGCGGGCTTCATTCGCTCCGCCTACGTCCGCCGGCTCCTGTCGCCCGGGGCCCCCGGCGG 

+ ^ h + + 3420 

TCGCCCGAAGTAATCGAGGCGGATGCAGGCGGCCGAGGACAGCGGGCCCCGGGGGCCGCC 
AGFIRSAYVRRLIiSPGAPGG- 

CGTCGCGGTGACCGTCCTCGACAAACTCACCTACGCCGGCAGCCTCGCCCGCCTGCACGC 

^ * h H + + 3480 

GCAGCGCCACTGGCAGGAGCTGTTTGAGTGGATGCGGCCGTCGGAGCGGGCGGACGTGCG 
VAVTVLDKLTYAGSLARLHA- 

GGTGCGTGACCATCCCGGCCTCACCTTCGTCCAGGGCGAC GTG TGCGACACCGCGCTCGT 
+ -i + +_ + 3540 

CCACGCACTGGTAXK5GCCGGAGTGGAAGCAGGTCCCGCTGCACACGCTGTGGCGCGAGCA 
VRDHPGLTFVQGDVCDTALV- 

CGACACGCTGGCCGCGCGGCACGACGACATCGTGCACTTCGCGGCCGAGTCGCACGTCGA 

■* + + + -i + 3600 

GCTGTGCGACCGGCGCGCCGTGCTGCTCTAGCACGTGAAGCGCCGGCTCAGCGTGCAGCT 
DTLAARHDDIVH FAAES HVD - 

CCGCTCCATCACCGACAGCGGTGCCTTCACCCGCACCAACGTGCTGGGCACCCAGGTCCT 

h + + + n + 3660 

GGCGAGGTAGTGGCTGT CGCCACGGAAGTGGGCGTGGTTGCACGACCCGTGGGTCCAGGA 
RSITDSGAFTRTNVLGTQVL- 

GCTCGACGCCGCGCTCCGCCACGGTGTGCGCACCTTCGTGCACGTCTCCACCGACGAGGT 
-t + + + + + 3720 

CGAGCTGCGGCGCGAGGCGGTGCCACACGCGTGGAAGGACGTGCAGAGGTGGCTGCTCCA 
LDAALRHGVRTFVHVSTDEV- 

GTACGGCTCCCTCCCGCACGGGGCCGCCGCGGAGAGCGACCCCCTGCTTCCGACCTCGCC 
+ + ^ + + + 3790 

CATGCCGAGGGAGGGCGTGCCCCGGCGGCGCCTCTCGCTrGGGGGACGAAGGCTGGAGCGG 
YG5LPHGAAAESDPLLPTSP- 

GTACGCGGCGTCGAAGGCGGCCTCGGACCTCATGGCGCTCGCCCACCACCGCACCCACGG 
+ + + ^ + + 3840 

CATGCGCCGCAGCTTCCGCCGGAGCCTGGAGTACCGCGAGCGGGTGGTGGCGTGGGTGCC 
YAASKAASDLMALAHHRTHG- 

CCTGGACGTCCGGGTGACCCGCTGTTCGAACAACTTCGGCCCCCACCAGCATCCCGAGAA 
+ + + + + + 3900 

GGACC7GCAGGCCCACT GGGCGACAAGCTTGTT GAAGC C GGGGGTGGT CGTAGGGCTCTT 
LDVRVTRCSNNFGPKQHPEK- 

GCTCATACCGCGCTTCCTGACCAGCCTCCTGTCCGGCGGCACCGTTCCCCTCTACGGCGA 
+ + ^ + + + 3960 

CGAGTATGGCGCGAAGGACTGGTCGGAGGACAGGCCGCCGTGGCAAGGGGAGATGCCGCT 
LI PRFLTSLLSGGTVPLYGD- 
CGGCCGGCACCTGCGCGACTGGCTGCACGTCGACGACCACGTCAGGGCCGTCGAACTCGT
h + + + i + 4020 

GCCCGCCGTGCACGCGCTGACCGACGTGCAGCTGCTGGTGCAGTC^ 
GRHVRDVLHVDDHVRAVELV- 

BglII

CCGCCTGTCGGGCCGGCCGGGAGAGATCTACAACATCGGGGGCGGCACCTCGCTGCCCAA 

4 + m + + + 40B0 

GGCGGACAGCCCGGCCGGCC C T C TCT A GRX G1 T G T A GCCCCCGCCGTGGAGCGACGGGTT 
RVSGRPGEIYNIGGGTSLPN- 

SstI

CCTGGAGCTCACGCACCGGrrTGCTCGCACTGTGCGGCGCGGGCCCGGAGCGCJITCGTCCA 
+ + + + + + 4140 

GGACCTCGAGTGCGTGGCCAACGAGCGTGACACGCCGCGCCCGGGCCTCGCGTAGCAGGT 
LELTHRLLALCGAGPE R I V H - 

CGTCGAGAACCGCAAGGGGCACGACCGGCGCTACGCGGTCGACCACAGCAAGATCACCGC

+ + i + + + 4200 

GCAGCTCTTGGCGTTCCCCGTGCTGGCCGCGATGCGCCAGCTGGTGTCGTTCTAGTGGCG 
VENRKGHDRRYAVDHSKITA- 



NruI

GGAACTCGGTTACCGGCCGCGCACCGACTTCGCGACCGCGCTGGCCGACACCGCGAAGTG 

+ + + + + 4260 

CCTTGAGCCAATGGCCGGCGCGTGGCTGAAGCGCTGGCGCGACCGGCTGTGGCGCTTCAC 
E X# G Y R PRTDFATALADTAKW - 

GTACGAGCGGCACGAGGACTGGTGGCGTCCCCTGCTCGCCGCGACATGACGTCGGGCCGG 

+ + + + + + 4320 

CATGCTCGCCGTGCTCCTGACCACCGCAGGGGACGAGCGGCGCTGTACTGCAGCCCGGCC 
Y ERHEDWWRPLLAAT* 

ACCGCAACCACCGGCCCCGGCCGGCACACCGCCGCCCGCGGCCGGTGGCCGGCCGGTCAG 
+ + + + + + 4380 

TGGCGTTGGTGGCCGGGGCCGGCCGTGTGGCGGCGGGCGCCGGCCACCGGCCGGCCAGTC 



CGTCCGTGAGCCGGGCGCCGGCCGCCCCGCGGGCCGGCGGCGGTGGACCCCCGGACCACC 
+ + + + + + 4440 

GCAGGCACTCGGCCCGCGGCCGGCGGGGCGCCCGGCCGCCGCCACCTGGGGGCCTGGTGG 
RGHAPRRGGRPGAATSGRVV- 

EcoRI

AGTTCCGGCATGAAGACGAATTCGGTGCGCGGCGGCGGCGTTCCGCTCATCTCCTCCAGC 
+ + + + + + 4500 

TCAAGGCCGTACTTCTGCTTAAGCCACGCGCCGCCGCCGCAAGGCGAGTAGAGGAGGTCG 
LEPMFVFETRPPPTGSMEEL- 
AGTGCGTCCACGGCGACCTGCCCCATCGCCTTGACGGGCTGTCTGATGGTGGTCAGGGGA

+ + h + 4 + 4560 

TC^CGCAGGTMCGCTGGACGGGGTAGCGGAACTGCCCGJVCAGJVCTACOICCAGTCCCCT 
1-ADVAVCGMAKVPQRITTLP- 

C»GCTCGGTGAAGGCC^TGAGCGGCGAGTCGTCGAATC 

CCCAGCCACTTCCCSGTACTCGCCGCTCAXSCAGCTTCGGC^SGTGGCTCTACAGTGGCCCT 
PDTFAMLPSDDFGVVS IDGP- 

ACCGTGAGACCCCGCCGGCCCGCGGCCCGCACGGCGCCGAGGGCCATCATGTCGCTGGCG 

* +- h + + 4680 

TGGCACTCTGGGGCGGCCGCGCGCCGGGCGTGCCGCGGCTCCCGGTAGTACAGCGACCGC 
VTLGRRRAARVAGLAMMDSA- 

CACATGACGGCGGTGCAGCCCAGGTCGATCAGCGCGGACGCGGCGGCCTGGCCCCCCTCC 

CTGTACTGCCGCCACGTCGGGTCCAGCTAGTCGCGCCTGCGCCGCCGGACCGGGGGGAGG 
CMVATCGLDILASAAAQGGE- 

SstI

AGGGAGAACAGCGAGTGCTGCACGAGCTCCTCGGACTCCCGCGCCGACACTCCCAGGTGC 
+ + + + ^ + 

TCCCTCTTGTCGCTCACGACGTGCTCGAGGAGCCTGAGGGCGCGGCTGTGAGGGTCCACG 
LSFLSHOVLEESERASVGLH- 

TCCCGCACGCCGGCCCGGAACCCCTCGATCTTCCGCTGCACCGGCACGAAGCGGGCGrGGC 

+ -+ + + + + 4860 

AGGGCGTGCGGCCGGGCCTTGGGGAGCTAGAAGGCGJICGTGGCCGTGCTTCGCCCGCCCG 
ERVGARFGEIKRCVPVrRAP- 

CCGACGGCGAGGCCGACGCGCTCGTGCCCCAGCTCCGCCAGGTGCGCCACGGCCAGGCGC 
+ + + + + + 

GGCTGCCGCTCCGGCTGCGCGAGCACGGGGTCGAGGCGGTCCACGCGGTGCCGGTCCGCG 
GVALGVREKG1.EALHAVALR- 

ATCGCGGCCCGGTCGTCCGGGGAGACGAAGGGTGCCTCGATCCGGGGCGAGAACCCGTTC 

+ + + + + + 4 980 

TAGCGCCGGGCCAGCAGGCCCCTCTGCTTCCCACGGAGCTAGGCCCCGCTCTTGGGCAAG 
MAARDDPSVFPAEIRPS FGN- 

ACGAGGACGAAGGGCACCTGCCGCTCGTGCAGCCGGCCGTACCGTCCGGTCTCGGCGCTG 

- H ^ ^ + + ^ 5040 

TGCTCCTGCTTCCCGTGGACGGCGAGCACGTCGGCCGGCATGGCAGGCCAGAGCCGCCAC 
VLVFPVQREHLRGYRGTEAT- 

GTGTCCGCGTGCAGTCCGGAGACGAAGATGATGCCGGACACCCCGCGGTCCACGAGCATC

+ + + + + + 5100 

CACAGGCGCACGTCAGGCCTCTGCTTCTACTACGGCCTGTGGGGCGCCAGGTGCTCGTAG 
TDAHLGSVFI IGSVGRDVLM- 

SmaI

TCCGTGAGTTCGTCCTCGGTCGAGCCGCCCGGGGTCTGCGTGGCGAGCACGGGCGTGTAG 

+ + ^ ^ + + 51fi0 

aggcactcaagcaggagccagctcggcgggccccagacgcaccgctcgtgcccgcacatc 
etledetsggptqtalvpty- 
CCCTGACGCGTGAGCGCCTGCCCCATCACCTGGGCCAGTGCGGGGAAGAAGGGGTTGTCC
< h 4 + h + 5220 

GGGACTGCGCACTCGCGGACGGGGTAGTGGACCCGGTCACGCCCCTTCTTCCCCAACAGG 
GQRTLAQGMVOALAPFFPND* 

AGTTCGGGGGTGACCAGTCCGACCAGCTCGGCGCGGCGCTGTCGCGCCGGCTGCTCGTAG 

«t + — 4— + 4* + 5280 

TCAAGCCCCCACTGGTCAX5GCTGCTCGAGCCTCGCCGCGACAGCGCGGCCGA 
LEPTVLGVLBARRQRAPQEY- 

CCCATCGCGTCCAGTCKIGOTCAGCACCGAGTCGCG^OTG^CGGTGGCCACACCGCGCGCA 
+ + H + + + 5340 

GGGTCG^GCAGGTCACGCCAGTCGTGGCTCAGCGCCCACGGCCACCGGTGTGGCGCGCGT 
GLADLATLVS DRTGTAVGRA- 

SmaI

CCGTTCAGCACCCGGCTGACCGTGGCCTTGCTGACGCCCGCCCGGGCTGCGATGTCGGCG 
-i +. + + + + 5400 

GGCAAGTCGTGGGCCG^CTGGCACCGGAACGACTGCGXSGCGGGCCCGACGCTACAGCCGC 
GNLVRSVTAKSVGARAAIDA- 

AGCCGCATGGTCATGGCAACGCACTCTACCTGTCGGGGCGTCAGGGCGTGCCCACCGCGC 

A + i + + + 5460 

TCGGCGTACCAGTACCGTTGCGTGAGATGGACAGCCCCGC^GTCCCGCACGGGTGGCGCG 
L R M T M 
— — acbE 

GCG<3AACCGGCGGACTGCG«GGCACGGCCCGTCCGCCGCCCACG^CCACGCGCCCGAAA 
+ + + + + + 5520 

CGCCTTG^CCGCCTGACGCCCCGTGCCGGGCAGGCGGCGGGTG^CCTGGTGCGCGGC CT TT 

CGATGGCTGAAAATGCTTGCAGCAAATTGCCGCAACGTCTTTCGGCGGCTTTTCGATCCT
+ + + + + + 5580 

GCTACCGACTTrrACGAACGTCGTTTAACGGCGTTGCAGAAAGCCGCCGAAAAGCTAGGA 

OTTACGTTCCTGGCAACCCCGGCGCCGCGCAGAAGCGGTTGGCGTGAGGCGTCCAGACCT 
+ + + + + + 5640 

CAATGCAAGGACCGTTGGGGCCGCGGCGCGTCTTCGCCAACCGCACTCCGCAGGTCTGGA 

CCG^CCGATTCCGGGATCACTCAG G^GAGT TCAC AATG CGGCGTGGCATTGCGGCCACCG 
+ + + + + + 5700 

GGCGGGCTAAGGCCCTAGTGAGTCCCCTCAAGTGTTACGCCGCACCGTAACGCCGGTGGC 

MRRGIAATA- 
acbF 

CGCTGTTCGCGGCTGTGGCCATGACGGCATCGGCGTGTGGCGXK^GGCGACAACGGCGGAA 

+ + + + + + 5760 

GCGACAAGCGCCGACACCGGTACTGCCGTAGCCGCACACCGCCCCCGCTGTTGCCGCCTT 
LFAAVAMTASACGGGDNGGS- 

KpnI

GCGGTACCGACGCGGGCGGCACGGAGCTGTCGGGGACCGTCACCTTCTGGGACACGTCCA 
+ + + + + + 5Q20 

CGCCATGGCTGCGCCCGCCGTGCCTCGACAGCCCCTGGCAGTGGAAGACCCTGTGCAGGT 
GT DAGGTE LS GTVT FWDT S N - 
ACGAAGCCGAGAAGGCGACGTACCAGGCCCTCGCGGAGGGCTTCGAGAAGGAGCACCCGA
—i + h + + 5880 

TGCTTCGGCTCTTCCGCTGCATGCTCCGGGAGCGCCTCCCGAAGCTCTTCCTCGTGGGCT 
EAEKATYQALAE GFEKEHPK- 

AGGTCGACGTCAAGTACGTCAACGTCCCGTTCGGCGAGGCGAACGCCAAGTTC^AGAACG 

™ — h — +— 4 + + 5940 

TCCAGCTGCAjGTTCATGCAGTTGCAGGGCAAGCCGCTCCGCTTGCGGTTCAACTTCTTGC 
VDVKYVHVp F GEANAKFKNA- 

CCGCGGGCGGCAACTCCGGTGCCCCGGACGTGJITGCGTACGGAGGTCG^CTGGGTCGCTC 

A + h + + + 6000 

GGCGCCCGCCGTTGAGGCCACGGGGCCTGCACTACGCATGCCTCGAGCGGRCCCAGCGCC 
AGGNS GAFDVMRTEVAWVAD- 

ACTTCGCCAGCATCGGCTACCTCGCCCCGCTCGACGGCACGCCCGCCCTCGACGACGGGT 
+ + ^ A + + €060 

TGAAGCGGTCGTAGCCGATGGAGCGGGGCGAGCTGCCGTGCGGGCGGGAGCTGCTGCCCA 
FAS IGYLAPLDGTPALDDGS — 

CGGACCACCTTCCCCAGGGCGGCAGCACCAGGTACGAGGGGAAGACCTACGCGGTCCCGC 
+- + + + + + 61 20 

GCCTGGTGGAAGGGGTCCCGCCGTCGTGGTCCATGCTCCCCTTCTGGATGCGCCAGGGCG 
DHLPQGGSTRYEGKTYAVPQ- 

AGCTGATCGACACCCTGGCGCTCTTCTACAACAAGGAACTGCTGACGAAGGCCGGTGTCG

+ + 4 4 ♦ ♦ 6180 

TCCACTAGCTGTGGGACCGCGAGAAGATGTTGTTCCTTGACGACTGCTTCCGGCCACAGC 
V I DTLALFYN KE LLTKAGVE- 

AGGTGCCGGGCTCCCTCGCCGAGCTGAAGACGGCCGCCGCCGAGATCACCGAGAAGACCG 

+ + + + + + 6240 

TCCACGGCCCGAGGGAGCGGCTCGACTTCTGCCGGCGGCGGCTCTAGTGGCTCTTCTGGC 
VPGSLAE I* K T A A A E I TEKTG — 

GCGCGAGCGGCCTCTACrrGCGGGGCGACGACCCGTACTTGGTTCCTGCCCTACCTCTACG 

+ + + H + + 6300 

CGCGCTCGCCGGAGATGACGCCCCGCTGCTGGGCATGAACCAAGGACGGGATGGAGATGC
ASGLYCGATTRTWFLPYLYG- 

GGGAGGGCGGCGACCTGGTCGACGAGAAGAACAAGACCGTCACGGTCGACGACGAAGCCG 

-k t- + + + + €360 

CCCTCCCGCCGCTGGACCAGCTGCTCTTCTTGTTCTGGCAGTGCCAGCTGCTGCTTCGGC 
EGGDLVDEKNKTVTVDDEAG- 

GTGTGCGCGCCTACCGCGTCATCAAGGACCTCGTGGACAGCAAGGCGGCCATCACCGACG
+ + + + + -4- 6420 

CACACGCGCGGATGGCGCAGTAGTTCCTGGAGCACCTGTCGTTCCGCCGGTAGTGGCTGC 
VRAYRV IKDLVDSKAAITDA- 

CGTCCGACGGCTGGAACAACATGCAGAACGCCTTCAAGTCGGGCAAGGTCGCCATGATGG 

+ + + + + + 6480 

GCAGGCTGCCGACCTTGTTGTACGTCTTGCGGAAGTTCAGCCCGTTCCAGCGGTACTACC 
SDGWNNMQNAFKSGKVAMMV- 

TCAACGGCCCCTGGGCCATCGAGGACGTCAAGGCGGGAGCCCGCTTCAAGGACGCCGGCA 

+ + + + + + 6540 

AGTTGCCGGGGACCCGGTAGCTCCTGCAGTTCCGCCCTCGGGCGAAGTTCCTGCGGCCGT 
NGPWAI EDV K A G A R FKDAGN- 
ACCTGGGGGTCGCCCCCGTCCCGGCCGGCAGTGCCGGACAGGGCTCTCCCCAGGGCGGGT

+ + + + + 6600 

TGGACCCCCAGCSGGGGCAGGGCCGGCCGTCACGGCCTGTCCCGAGaGGGGTCCCGCCCA 
LGVAPVPAGSAGQGS PQ5GW- 

GGAACCTCTCGGTGTACGCGGGCTCGAAGJVACCTCGACGCCTCCTACGCCTTCGTGAAGT 

4 h h ^ + + 6660 

CCTTGGAGAGCCACATGCGCCCGAGCTTCTTGGAGCTGCGGAGGATGCGGAAGCACTTCA 
NLSVYAGSKNLDASYAFVKY- 

SstI

ACATGAGCT CCGCCAAGGTGCAGCAGCAGACCACCGAGAAGCTGAGCCTGCTGCCCACCC 

+ + ^ + n + 6720 

TGTACTCGAGGCGGTTCCACGTCGTCGTCTGGTGGCTCTTCGACTCGGACGACGGGTGGG 
MSSAKVQQQTTEKLSLLPTR- 

GCACGTCCGTCTACGAGGTCCCGTCCGTCGCGGACAACGAGATGGTGAAGTTCTTCAAGC 

+ + + + + + 6*780 

CGTGCAGGCAGATGCTCCAGGGCAGGCAGCGCCTGTTGCTCTACCACTTCAAGAAGTTCG 
TSVYEV PSVADNEMVKFFK P- 

CGGCCGTCGACAAGGCCGTCGAACGGCCGTGGATCGCCGAGGGCAATGCCCTCTTCGAGC 

^ + + h + + 6840 

GCCGGCAGCTGTTCCGGCAGCTTGCCGGCACCTAGCGGCTCCCGTTACGGGAGAAGCTCG 
AVDKAVERPWIAEGNALFEP- 

PstI

CGATCCGGCTGCAG 

+ 6854 

GCTAGGCCGACGTC 
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:

(A) NAME: Hoechst Aktiengesellschaft

(B) STREET:

(C) CITY: Frankfurt

(D) FEDERAL STATE: -

(E) COUNTRY: Germany

(F) POSTAL CODE: 65926

(G) TELEPHONE: 069-305-3005

(H) TELEFAX: 069-35-7175

(I) TELEX: -

(ii) TITLE OF APPLICATION: Isolation of the genes for
biosynthesizing pseudo-oligosaccharides from
Streptomyces glaucescens GLA.O and their use

(iii) NUMBER OF SEQUENCES: 13

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(2) INFORMATION FOR SEQ ID NO.: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURES:

(A) NAME/KEY: exon 

(B) LOCATION: 1..22 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 1 : 

CSGGSGSSGC SGGSTTCATS GG 

(2) INFORMATION FOR SEQ ID NO.: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(ix) FEATURES: 

(A) NAME/KEY: exon 

(B) LOCATION: 1..24 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 2: 

GGGWVCTGGY VSGGSCCGTA GTTG 

(2) INFORMATION FOR SEQ ID NO.: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 546 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURES: 

(A) NAME/KEY: exon 
(B) LOCATION: 1..546

(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 3:

CCCGGGCGGG GCGGGGTTCA TCGGCTCCGC CTACGTCCGC CGGCTCCTGT CGCCCGGGGC  60
CCCCGGCGGC GTCGCGGTGA CCGTCCTCGA CAAACTCACC TACGCCGGCA GCCTCGCCCG  120
CCTGCACGCG GTGCGTGACC ATCCCGGCCT CACCTTCGTC CAGGGCGACG TGTCCGACAC  180
CGCGCTCGTC GACACGCTGG CCGCGCGGCA CGACGACATC GTGCACTTCG CGGCCGAGTC  240
GCACGTCGAC CGCTCCATCA CCGACAGCGG TGCCTTCACC CGCACCAACG TGCTGGGCAC  300
CCAGGTCCTG CTCGACGCCG CGCTCCGCCA CGGTGTGCGC ACCCTCGTGC ACGTCTCCAC  360
CGACGAGGTG TACGGCTCCC TCCCGCACGG GGCCGCCGCG GAGAGCGACC CCCTGCTCCC  420
GACCTCGCCG TACGCGGCGT CGAAGGCGGC CTCGGACCTC ATGGCGCTCG CCCACCACCG  480
CACCCACGGC CTGGACGTCC GGGTGACCCG CTGTTCGAAC AACTACGGCC CGCACCAGTT  540
CCCGGG  546



(2) INFORMATION FOR SEQ ID NO.: 4: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 541 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURES: 

(A) NAME/KEY: exon 

(B) LOCATION: 1..541 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 4:

CCCCGGGTGC TGGTAGGGGC CGTAGTTSTT GGAGCAGCGG GTGATGCGCA CGTCCAGGCC  60
GTGGCTGACG TGCATGGCCA GCGCGAGCAG GTCGCCCGAC GCCTTGGAGG TGGCATAGGG  120
GCTGTTGGGG CGCAGCGGCT CGTCCTCCGT CCACGACCCC GTCTCCAGCG AGCCGTAGAC  180
CTCGTCGGTG GACACCTGCA CGAAGGGGGC CACGCCGTGC CGCAGGGCCG CGTCGAGGAG  240
TGTCTGCGTG CCGCCGGCGT TGGTCCGCAC GAACGCGGCG GCATCGAGCA GCGAGCGGTC  300
CACGTGCGAC TCGGCGGCGA GGTGCACGAC CTGGTCCTGG CCGGCCATGA CCCGGTCGAC  360
CAGGTCCGCG TCGCAGATGT CGCCGTGGAC GAAGCGCAGC CGGGGGTGGT CGCGGACCGG  420
CTCGAGGTTG GCGAGGTTGC CGGCGTAGCT CAGGGCGTCG AGCACGGTGA CGACGGCGTC  480
GGGCGGCCCG TCCGGACCGA GGAGGGTGCG GACGTAGTGC GAGCCCATGA ACCCCGCCGC  540
C  541

(2) INFORMATION FOR SEQ ID NO.: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 180 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: protein 

(ix) FEATURES: 

(A) NAME/KEY: PCRstrE.Pep 

(B) LOCATION: 1..180 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 5: 
Ala Ala Gly Phe Met Gly Ser His Tyr Val Arg Thr Leu Leu Gly Pro
1               5                   10                  15

Asp Gly Pro Pro Asp Ala Val Val Thr Val Leu Asp Ala Leu Ser Tyr
            20                  25                  30

Ala Gly Asn Leu Ala Asn Leu Asp Pro Val Arg Asp His Pro Arg Leu
        35                  40                  45

Arg Phe Val His Gly Asp Ile Cys Asp Ala Asp Leu Val Asp Arg Val
    50                  55                  60

Met Ala Gly Gln Asp Gln Val Val His Leu Ala Ala Glu Ser His Val
65                  70                  75                  80

Asp Arg Ser Leu Leu Asp Ala Ala Ala Phe Val Arg Thr Asn Ala Gly
                85                  90                  95

Gly Thr Gln Thr Leu Leu Asp Ala Ala Leu Arg His Gly Val Ala Pro
            100                 105                 110

Phe Val Gln Val Ser Thr Asp Glu Val Tyr Gly Ser Leu Glu Thr Gly
        115                 120                 125

Ser Trp Thr Glu Asp Glu Pro Leu Arg Pro Asn Ser Pro Tyr Ala Thr
    130                 135                 140

Ser Lys Ala Ser Gly Asp Leu Leu Ala Leu Ala Met His Val Ser His
145                 150                 155                 160

Gly Leu Asp Val Arg Ile Thr Arg Cys Ser Asn Asn Tyr Gly Pro Tyr
                165                 170                 175

Gln His Pro Gly
            180



(2) INFORMATION FOR SEQ ID NO.: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 181 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(ix) FEATURES:

(A) NAME/KEY: PCR acbD.Pep

(B) LOCATION: 1..181

(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 6:
Pro Gly Gly Ala Gly Phe Ile Gly Ser Ala Tyr Val Arg Arg Leu Leu
1               5                   10                  15

Ser Pro Gly Ala Pro Gly Gly Val Ala Val Thr Val Leu Asp Lys Leu
            20                  25                  30

Thr Tyr Ala Gly Ser Leu Ala Arg Leu His Ala Val Arg Asp His Pro
        35                  40                  45

Gly Leu Thr Phe Val Gln Gly Asp Val Cys Asp Thr Ala Leu Val Asp
    50                  55                  60

Thr Leu Ala Ala Arg His Asp Asp Ile Val His Phe Ala Ala Glu Ser
65                  70                  75                  80

His Val Asp Arg Ser Ile Thr Asp Ser Gly Ala Phe Thr Arg Thr Asn
                85                  90                  95

Val Leu Gly Thr Gln Val Leu Leu Asp Ala Ala Leu Arg His Gly Val
            100                 105                 110

Arg Thr Leu Val His Val Ser Thr Asp Glu Val Tyr Gly Ser Leu Pro
        115                 120                 125

His Gly Ala Ala Ala Glu Ser Asp Pro Leu Leu Pro Thr Ser Pro Tyr
    130                 135                 140

Ala Ala Ser Lys Ala Ala Ser Asp Leu Met Ala Leu Ala His His Arg
145                 150                 155                 160

Thr His Gly Leu Asp Val Arg Val Thr Arg Cys Ser Asn Asn Tyr Gly
                165                 170                 175

Pro His Gln Phe Pro
            180



(2) INFORMATION FOR SEQ ID NO.: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6854 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURES:

(A) NAME/KEY: "acarbose" biosynthesis gene cluster

(B) LOCATION: 1..6854

(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 7:
CTGCAGGGTT CCCTGGTGCA CGACCCGCCC CTGGTCGACG ACCAGGGCGC TGTCGCAGAT  60
CGCGGCGATG TCGGCGATGT CGTGGCTGGT GAGCACCACG GTGGTGCCCA GTTCCCGGTG  120
GGCGCGGTTG ACCAGCCGGC GCACCGCGTC CTTCAGCACC ATGTCGAGGC CGATCGTGGG  180
CTCGTCCCAG AACAGCACGG CCGGGTCGTG CAGCAGGCTC GCCGCGATCT CGGCGCGCAT  240
GCGCTGTCCG AGGCTGAGCT GCCGCACGGG GGTGGACCCC AGCGCGTCGA TGTCGAGGAG  300
GTCCCGGAAC AGGGCGAGGT TGCGCCGGTA GACCGGTCCG GGGATGTCGT AGATGCGGCG  360
CAGGATGCGG AAGGAGTCGG GTACCGACAG GTCCCACCAG AGCTGGCTGC GCTGGCCGAA  420
GACGACGCCG ATCGTGCGGG CGTTGCGCTG CCGGTGCCGG TAGGGCTCCA GCCCGGCGAC  480
CGTGCAGCGG CCGGAGGTGG GGGTCATGAT GCCGGTCAGC ATCTTGATCG TGGTCGACTT  540
GCCGGCTCCG TTGGCGCCGA TGTAGGCGGT CTTCGTGCCG GCCGGTATCT CGAAGGAGAC  600
GTCGTCGACG GCGCGCACGA CGCGGTACCG GCGGGTCAGG AGGGTGGAGA GGCTGCCGAG  660
CAGGCCGGGC TCGCGTTCGG CCAGCCGGAA CTCCTTGACG AGGTGTTCGG CCACGATCAC  720
GCGATCACCC GCTCGACGGC CGTCTCCAGC AGGCGCAGGC CCTCGTCGAG CAGCGCCTCG  780
TCGAGGGTGA ACGGCGGTGC CAGCCGCAGG ATGTGGCCGC CCAGGGAGGT GCGCAGCCCC  840
AGGTCGAGGG CGGTGGTGTA GACGGCCCGG GCGGTCTCGG GGGCGGGTGC CCGGCCGACG  900
GCGTCGGTGA CGAACTCCAG GCCCCACAGC AGTCCGAGGC CGCGTACCTG GCCGAGCTGG  960
GGGAAGCGGG ACTCCAGGGC GCGCAGCCGC TCCTGGATGA GCTCGCCGAG GACGCGCACG  1020
CGGTCGATCA GCCGGTCGCG CTCGACGACC TCCAGCGTGG CGCGGGCGGC GGCGATCCCC  1080
AGTGGGTTGC TCGCGTACGT CGAGGCGTAC GCCCCGGGGT GGCCGCCTCC GGCCTGCGCA  1140
GCTTCCGCGC GTCCGGCCAG CACGGCGAAG GGGAATCCGC TCGCGGTGCC CTTGGACAGC  1200
ATCGCCAGGT CCGGCTCGAT GCCGAACAGT TCGCTGGCGA GGAAGGCGCC GGTGCGCCCG  1260
CCGCCGGTGA GGACCTCGTC GGCGACGAGC AGCACGCCGC CGTCCCGGCA GGCGCCGGCG  1320
ATCCGCTCCC AGTAGCCGGG GGGCGGCACG ATGACGCCTG CCGCGCCGAG GACGGGTTCG  1380
AAGACCAGGG CCGAGACGTT GGGCTTCTCC GCGATGTGCC GGCGCACGAG GGTCGCGCAC  1440
CGCACGTCGC ACGAGGGGTA CTCCAGGCCC AGGGGACAGC GGTAGCCAGT AGGGGCTGTA  1500
GCCAGCACGC TGTTGCCGCT GAAGGCCTGG TGGCCGATGT CCCAGTGGAC CAGCATCCGG  1560
GCGCCCATGG TCTTGCCGTG GAAGCCGTGG CGCAGGGCGC AGATCCGGTT GCGGCCCGGC  1620
GCGGCGGTCG CCTGGACGAC CCGCAGGGCG GCCTCGACCA CCTCCGCGCC GGTGGAGAAG  1680
AAGGCGTAGG TGTCGAGCTG TTCGGGCAGC AGCCTGGCGA GCAGTTCCAG CAGGCCGGCG  1740
CGGTCCGGCG TGGCGCTGTC GTGGACGTTC CACAGGCGGC GGGCCTGGGT GGTGAGTGCC  1800
TCGACGACCT CCGGGTGCCC GTGGCCCAGT GACTGGGTGA GGGTCCCGGC CGCGAAGTCG  1860
AGGTACTGGT TGCCGTCCAG GTCGGTCAGA ACGGGACCGC GTCCCTCGGC GAAGACCCGG  1920
CGTCCGTGGA CGGCTTCCTC GGAGGCGCCC GGCGCCAGGT CGCGGGCCTC CCGTGCCAGG  1980
TGCTGTGTCT GCCGTAAGCC TGTCATCGCT GCCTCTGCTC GTCGGACCGG CTGACGCGAT  2040
CGCCGGCGAA CTGCGTTGTG GCGCACCACG GTTGGGGCGG CTCGGCGCTG AGTCAAACAC  2100
TTGAACACAC ACCGCTGCAA GAGTTTGCGG GTTGTTTCAG AAAGTTGTTG CGAGCGGCCC  2160
CGGCACTCTG GTTGAGTCGA CGTGCTTACG GCGCCACCAC GCCTCACGTT CGAGGAGGGA  2220
CCTGTGAGAA CAAGCCCGCA GACCGACCCG CTCCCGCGGA GGCCGAGGTG AAGGCCCTGG  2280
TCCTGGCAGG TGGAACCGGC AGCAGACTGA GGCCGTTCAC CCACACCGCC GCCAAGCAGC  2340
TGCTCCCCAT CGCCAACAAG CCCGTGCTCT TCTACGCGCT GGAGTCCCTC GCCGCGGCGG  2400
GTGTCCGGGA GGCCGGCGTC GTCGTGGGCG CGTACGGCCG GGAGATCCGC GAACTCACCG  2460
GCGACGGCAC CGCGTTCGGG TTACGCATCA CCTACCTCCA CCAGCCCCGC CCGCTCGGTC  2520
TCGCGCACGC GGTGCGCATC GCCCGCGGCT TCCTGGGCGA CGACGACTTC CTGCTGTACC  2580
TGGGGGACAA CTACCTGCCC CAGGGCGTCA CCGACTTCGC CCGCCAATCG GCCGCCGATC  2640
CCGCGGCGGC CCGGCTGCTG CTCACCCCGG TCGCGGACCC GTCCGCCTTC GGCGTCGCGG  2700
AGGTCGACGC GGACGGGAAC GTGCTGCGCT TGGAGGAGAA ACCCGACGTC CCGCGCAGCT  2760
CGCTCGCGCT CATCGGCGTG TACGCCTTCA GCCCGGCCGT CCACGAGGCG GTACGGGCCA  2820
TCACCCCCTC CGCCCGCGGC GAGCTGGAGA TCACCCACGC CGTGCAGTGG ATGATCGACC  2880
GGGGCCTGCG CGTACGGGCC GAGACCACCA CCCGGCCCTG GCGCGACACC GGCAGCGCGG  2940
AGGACATGCT GGAGGTCAAC CGTCACGTCC TGGACGGACT GGAGGGCCGC ATCGAGGGGA  3000
AGGTCGACGC GCACAGCACG CTGGTCGGCC GGGTCCGGGT GGCCGAAGGC GCGATCGTGC  3060
GGGGGTCACA CGTGGTGGGC CCGGTGGTGA TCGGCGCGGG TGCCGTCGTC AGCAACTCCA  3120
GTGTCGGCCC GTACACCTCC ATCGGGGAGG ACTGCCGGGT CGAGGACAGC GCCATCGAGT  3180
ACTCCGTCCT GCTGCGCGGC GCCCAGGTCG AGGGGGCGTC CCGCATCGAG GCGTCCCTCA  3240
TCGGCCGCGG CGCCGTCGTC GGCCCGGCCC CCCGTCTCCC GCAGGCTCAC CGACTGGTGA  3300
TCGGCGACCA CAGCAAGGTG TATCTCACCC CATGACCACG ACCATCCTCG TCACCGGCGG  3360
AGCGGGCTTC ATTCGCTCCG CCTACGTCCG CCGGCTCCTG TCGCCCGGGG CCCCCGGCGG  3420
CGTCGCGGTG ACCGTCCTCG ACAAACTCAC CTACGCCGGC AGCCTCGCCC GCCTGCACGC  3480
GGTGCGTGAC CATCCCGGCC TCACCTTCGT CCAGGGCGAC GTGTGCGACA CCGCGCTCGT  3540
CGACACGCTG GCCGCGCGGC ACGACGACAT CGTGCACTTC GCGGCCGAGT CGCACGTCGA  3600
CCGCTCCATC ACCGACAGCG GTGCCTTCAC CCGCACCAAC GTGCTGGGCA CCCAGGTCCT  3660
GCTCGACGCC GCGCTCCGCC ACGGTGTGCG CACCTTCGTG CACGTCTCCA CCGACGAGGT  3720
GTACGGCTCC CTCCCGCACG GGGCCGCCGC GGAGAGCGAC CCCCTGCTTC CGACCTCGCC  3780
GTACGCGGCG TCGAAGGCGG CCTCGGACCT CATGGCGCTC GCCCACCACC GCACCCACGG  3840
CCTGGACGTC CGGGTGACCC GCTGTTCGAA CAACTTCGGC CCCCACCAGC ATCCCGAGAA  3900
GCTCATACCG CGCTTCCTGA CCAGCCTCCT GTCCGGCGGC ACCGTTCCCC TCTACGGCGA  3960
CGGGCGGCAC GTGCGCGACT GGCTGCACGT CGACGACCAC GTCAGGGCCG TCGAACTCGT  4020
CCGCGTGTCG GGCCGGCCGG GAGAGATCTA CAACATCGGG GGCGGCACCT CGCTGCCCAA  4080
CCTGGAGCTC ACGCACCGGT TGCTCGCACT GTGCGGCGCG GGCCCGGAGC GCATCGTCCA  4140
CGTCGAGAAC CGCAAGGGGC ACGACCGGCG CTACGCGGTC GACCACAGCA AGATCACCGC  4200
GGAACTCGGT TACCGGCCGC GCACCGACTT CGCGACCGCG CTGGCCGACA CCGCGAAGTG  4260
GTACGAGCGG CACGAGGACT GGTGGCGTCC CCTGCTCGCC GCGACATGAC GTCGGGCCGG  4320
ACCGCAACCA CCGGCCCCGG CCGGCACACC GCCGCCCGCG GCCGGTGGCC GGCCGGTCAG  4380
CGTCCGTGAG CCGGGCGCCG GCCGCCCCGC GGGCCGGCGG CGGTGGACCC CCGGACCACC  4440
AGTTCCGGCA TGAAGACGAA TTCGGTGCGC GGCGGCGGCG TTCCGCTCAT CTCCTCCAGC  4500
AGTGCGTCCA CGGCGACCTG CCCCATCGCC TTGACGGGCT GTCTGATGGT GGTCAGGGGA  4560
GGGTCGGTGA AGGCCATGAG CGGCGAGTCG TCGAAGCCGA CCACCGAGAT GTCACCGGGA  4620
ACCGTGAGAC CCCGCCGGCG CGCGGCCCGC ACGGCGCCGA GGGCCATCAT GTCGCTGGCG  4680
CACATGACGG CGGTGCAGCC CAGGTCGATC AGCGCGGACG CGGCGGCCTG GCCCCCCTCC  4740
AGGGAGAACA GCGAGTGCTG CACGAGCTCC TCGGACTCCC GCGCCGACAC TCCCAGGTGC  4800
TCCCGCACGC CGGCCCGGAA CCCCTCGATC TTCCGCTGCA CCGGCACGAA GCGGGCGGGC  4860
CCGACGGCGA GGCCGACGCG CTCGTGCCCC AGCTCCGCCA GGTGCGCCAC GGCCAGGCGC  4920
ATCGCGGCCC GGTCGTCCGG GGAGACGAAG GGTGCCTCGA TCCGGGGCGA GAACCCGTTC  4980
ACGAGGACGA AGGGCACCTG CCGCTCGTGC AGCCGGCCGT ACCGTCCGGT CTCGGCGGTG  5040
GTGTCCGCGT GCAGTCCGGA GACGAAGATG ATGCCGGACA CCCCGCGGTC CACGAGCATC  5100
TCCGTGAGTT CGTCCTCGGT CGAGCCGCCC GGGGTCTGCG TGGCGAGCAC GGGCGTGTAG  5160
CCCTGACGCG TGAGCGCCTG CCCCATCACC TGGGCCAGTG CGGGGAAGAA GGGGTTGTCC  5220
AGTTCGGGGG TGACCAGTCC GACCAGCTCG GCGCGGCGCT GTCGCGCCGG CTGCTCGTAG  5280
CCCAGCGCGT CCAGTGCGGT CAGCACCGAG TCGCGGGTGC CGGTGGCCAC ACCGCGCGCA  5340
CCGTTCAGCA CCCGGCTGAC CGTGGCCTTG CTGACGCCCG CCCGGGCTGC GATGTCGGCG  5400
AGCCGCATGG TCATGGCAAC GCACTCTACC TGTCGGGGCG TCAGGGCGTG CCCACCGCGC  5460
GCGGAACCGG CGGACTGCGG GGCACGGCCC GTCCGCCGCC CACGGACCAC GCGCCCGAAA  5520
CGATGGCTGA AAATGCTTGC AGCAAATTGC CGCAACGTCT TTCGGCGGCT TTTCGATCCT  5580
GTTACGTTCC TGGCAACCCC GGCGCCGCGC AGAAGCGGTT GGCGTGAGGC GTCCAGACCT  5640
CCGCCCGATT CCGGGATCAC TCAGGGGAGT TCACAATGCG GCGTGGCATT GCGGCCACCG  5700
CGCTGTTCGC GGCTGTGGCC ATGACGGCAT CGGCGTGTGG CGGGGGCGAC AACGGCGGAA  5760
GCGGTACCGA CGCGGGCGGC ACGGAGCTGT CGGGGACCGT CACCTTCTGG GACACGTCCA  5820
ACGAAGCCGA GAAGGCGACG TACCAGGCCC TCGCGGAGGG CTTCGAGAAG GAGCACCCGA  5880
AGGTCGACGT CAAGTACGTC AACGTCCCGT TCGGCGAGGC GAACGCCAAG TTCAAGAACG  5940
CCGCGGGCGG CAACTCCGCT GCCCCGGACG TGATGCGTAC GGAGGTCGCC TGGGTCGCGG  6000
ACTTCCCCAG CATCGGCTAC CTCGCCCCGC TCGACGGCAC GCCCGCCCTC GACGACGGGT  6060
CGGACCACCT TCCCCAGGGC GGCAGCACCA GGTACGAGGG GAAGACCTAC GCGGTCCCGC  6120
AGGTGATCGA CACCCTGGCG CTCTTCTAGA ACAAGGAACT GCTGACGAAG GCCGGTGTCG  6180
AGGTGCCGGG CTCCCTCGCC GAGCTGAAGA CGGCCGCCGC CGAGATCACC GAGAAGACCG  6240
GCGCGAGCGG CCTCTACTGC GGGGCGACGA CCCGTACTTG GTTCCTGCCC TACCTCTACG  6300
GGGAGGGCGG CGACCTGGTC GACGAGAAGA ACAAGACCGT CACGGTCGAC GACGAAGCCG  6360
GTGTGCGCGC CTACCGCGTC ATCAAGGACC TCGTGGACAG CAAGGCGGCC ATCACCGACG  6420
CGTCCGACGG CTGGAACAAC ATGCAGAACG CCTTCAAGTC GGGCAAGGTC GCCATGATGG  6480
TCAACGGCCC CTGGGCCATC GAGGACGTCA AGGCGGGAGC CCGCTTCAAG GACGCCGGCA  6540
ACCTGGGGGT CGCCCCCGTC CCGGCCGGCA GTGCCGGACA GGGCTCTCCC CAGGGCGGGT  6600
GGAACCTCTC GGTGTACGCG GGCTCGAAGA ACCTCGACGC CTCCTACGCC TTCGTGAAGT  6660
ACATGAGCTC CGCCAAGGTG CAGCAGCAGA CCACCGAGAA GCTGAGCCTG CTGCCCACCC  6720
GCACGTCCGT CTACGAGGTC CCGTCCGTCG CGGACAACGA GATGGTGAAG TTCTTCAAGC  6780
CGGCCGTCGA CAAGGCCGTC GAACGGCCGT GGATCGCCGA GGGCAATGCC CTCTTCGAGC  6840
CGATCCGGCT GCAG  6854



(2) INFORMATION FOR SEQ ID NO.: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 240 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(ix) FEATURES: 

(A) NAME/KEY: acbA 

(B) LOCATION: 1..240 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 8:

Val Ile Val Ala Glu His Leu Val Lys Glu Phe Arg Leu Ala Glu Arg 
1 5 10 15 

Glu Pro Gly Leu Leu Gly Ser Leu Ser Thr Leu Leu Thr Arg Arg Tyr 
20 25 30 

Arg Val Val Arg Ala Val Asp Asp Val Ser Phe Glu Ile Pro Ala Gly 
35 40 45 

Thr Lys Thr Ala Tyr Ile Gly Ala Asn Gly Ala Gly Lys Ser Thr Thr 
50 55 60 

Ile Lys Met Leu Thr Gly Ile Met Thr Pro Thr Ser Gly Arg Cys Thr 
65 70 75 80 

Val Ala Gly Leu Glu Pro Tyr Arg His Arg Gln Arg Asn Ala Arg Thr 
85 90 95 

Ile Gly Val Val Phe Gly Gln Arg Ser Gln Leu Trp Trp Asp Leu Ser 
100 105 110 

Val Pro Asp Ser Phe Arg Ile Leu Arg Arg Ile Tyr Asp Ile Pro Gly 
115 120 125 

Pro Val Tyr Arg Arg Asn Leu Ala Leu Phe Arg Asp Leu Leu Asp Ile 
130 135 140 

Asp Ala Leu Gly Ser Thr Pro Val Arg Gln Leu Ser Leu Gly Gln Arg 
145 150 155 160 

Met Arg Ala Glu Ile Ala Ala Ser Leu Leu His Asp Pro Ala Val Leu 
165 170 175 

Phe Trp Asp Glu Pro Thr Ile Gly Leu Asp Met Val Leu Lys Asp Ala 
180 185 190 

Val Arg Arg Leu Val Asn Arg Ala His Arg Glu Leu Gly Thr Thr Val 
195 200 205 

Val Leu Thr Ser His Asp Ile Ala Asp Ile Ala Ala Ile Cys Asp Ser 
210 215 220 

Ala Leu Val Val Asp Gln Gly Arg Val Val His Gln Gly Thr Leu Gln 
225 230 235 240 



(2) INFORMATION FOR SEQ ID NO.: 9: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 429 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(ix) FEATURES: 

(A) NAME/KEY: acbB 

(B) LOCATION: 1..429 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 9: 

Met Thr Gly Leu Arg Gln Thr Gln His Leu Ala Arg Glu Ala Arg His 
1 5 10 15 

Leu Ala Pro Gly Ala Ser Glu Glu Ala Val His Gly Arg Arg Val Phe 
20 25 30 

Ala Glu Gly Arg Gly Pro Val Leu Thr Asp Leu Asp Gly Asn Gln Tyr 
35 40 45 

Leu Asp Phe Ala Ala Gly Thr Leu Thr Gln Ser Leu Gly His Gly His 
50 55 60 

Pro Glu Val Val Glu Ala Leu Thr Thr Gln Ala Arg Arg Leu Trp Asn 
65 70 75 80 

Val His Asp Ser Ala Thr Pro Asp Arg Ala Gly Leu Leu Glu Leu Leu 
85 90 95 

Ala Arg Leu Leu Pro Glu Gln Leu Asp Thr Tyr Ala Phe Phe Ser Thr 
100 105 110 

Gly Ala Glu Val Val Glu Ala Ala Leu Arg Val Val Gln Ala Thr Ala 
115 120 125 

Ala Pro Gly Arg Asn Arg Ile Cys Ala Leu Arg His Gly Phe His Gly 
130 135 140 

Lys Thr Met Gly Ala Arg Met Leu Val His Trp Asp Ile Gly His Gln 
145 150 155 160 

Ala Phe Ser Gly Asn Ser Val Leu Ala Thr Ala Pro Thr Gly Tyr Arg 
165 170 175 

Cys Pro Leu Gly Leu Glu Tyr Pro Ser Cys Asp Val Arg Cys Ala Thr 
180 185 190 

Leu Val Arg Arg His Ile Ala Glu Lys Pro Asn Val Ser Ala Leu Val 
195 200 205 

Phe Glu Pro Val Leu Gly Ala Ala Gly Val Ile Val Pro Pro Pro Gly 
210 215 220 

Tyr Trp Glu Arg Ile Ala Gly Ala Cys Arg Asp Gly Gly Val Leu Leu 
225 230 235 240 

Val Ala Asp Glu Val Leu Thr Gly Gly Gly Arg Thr Gly Ala Phe Leu 
245 250 255 

Ala Ser Glu Leu Phe Gly Ile Glu Pro Asp Leu Ala Met Leu Ser Lys 
260 265 270 

Gly Thr Ala Ser Gly Phe Pro Phe Ala Val Leu Ala Gly Arg Ala Glu 
275 280 285 

Ala Ala Gln Ala Gly Gly Gly His Pro Gly Ala Tyr Ala Ser Thr Tyr 
290 295 300 

Ala Ser Asn Pro Leu Gly Ile Ala Ala Ala Arg Ala Thr Leu Glu Val 
305 310 315 320 

Val Glu Arg Asp Arg Leu Ile Asp Arg Val Arg Val Leu Gly Glu Leu 
325 330 335 

Ile Gln Glu Arg Leu Arg Ala Leu Glu Ser Arg Phe Pro Gln Leu Gly 
340 345 350 

Gln Val Arg Gly Leu Gly Leu Leu Trp Gly Leu Glu Phe Val Thr Asp 
355 360 365 

Ala Val Gly Arg Ala Pro Ala Pro Glu Thr Ala Arg Ala Val Tyr Thr 
370 375 380 

Thr Ala Leu Asp Leu Gly Leu Arg Thr Ser Leu Gly Gly His Ile Leu 
385 390 395 400 

Arg Leu Ala Pro Pro Phe Thr Leu Asp Glu Ala Leu Leu Asp Glu Gly 
405 410 415 

Leu Arg Leu Leu Glu Thr Ala Val Glu Arg Val Ile Ala 
420 425 



(2) INFORMATION FOR SEQ ID NO.: 10: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 355 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(ix) FEATURES: 

(A) NAME/KEY: acbC 
(B) LOCATION: 1..355 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 10: 

Val Lys Ala Leu Val Leu Ala Gly Gly Thr Gly Ser Arg Leu Arg Pro 
1 5 10 15 

Phe Thr His Thr Ala Ala Lys Gln Leu Leu Pro Ile Ala Asn Lys Pro 
20 25 30 

Val Leu Phe Tyr Ala Leu Glu Ser Leu Ala Ala Ala Gly Val Arg Glu 
35 40 45 

Ala Gly Val Val Val Gly Ala Tyr Gly Arg Glu Ile Arg Glu Leu Thr 
50 55 60 

Gly Asp Gly Thr Ala Phe Gly Leu Arg Ile Thr Tyr Leu His Gln Pro 
65 70 75 80 

Arg Pro Leu Gly Leu Ala His Ala Val Arg Ile Ala Arg Gly Phe Leu 
85 90 95 

Gly Asp Asp Asp Phe Leu Leu Tyr Leu Gly Asp Asn Tyr Leu Pro Gln 
100 105 110 

Gly Val Thr Asp Phe Ala Arg Gln Ser Ala Ala Asp Pro Ala Ala Ala 
115 120 125 

Arg Leu Leu Leu Thr Pro Val Ala Asp Pro Ser Ala Phe Gly Val Ala 
130 135 140 

Glu Val Asp Ala Asp Gly Asn Val Leu Arg Leu Glu Glu Lys Pro Asp 
145 150 155 160 

Val Pro Arg Ser Ser Leu Ala Leu Ile Gly Val Tyr Ala Phe Ser Pro 
165 170 175 

Ala Val His Glu Ala Val Arg Ala Ile Thr Pro Ser Ala Arg Gly Glu 
180 185 190 

Leu Glu Ile Thr His Ala Val Gln Trp Met Ile Asp Arg Gly Leu Arg 
195 200 205 

Val Arg Ala Glu Thr Thr Thr Arg Pro Trp Arg Asp Thr Gly Ser Ala 
210 215 220 

Glu Asp Met Leu Glu Val Asn Arg His Val Leu Asp Gly Leu Glu Gly 
225 230 235 240 

Arg Ile Glu Gly Lys Val Asp Ala His Ser Thr Leu Val Gly Arg Val 
245 250 255 

Arg Val Ala Glu Gly Ala Ile Val Arg Gly Ser His Val Val Gly Pro 
260 265 270 

Val Val Ile Gly Ala Gly Ala Val Val Ser Asn Ser Ser Val Gly Pro 
275 280 285 

Tyr Thr Ser Ile Gly Glu Asp Cys Arg Val Glu Asp Ser Ala Ile Glu 
290 295 300 

Tyr Ser Val Leu Leu Arg Gly Ala Gln Val Glu Gly Ala Ser Arg Ile 
305 310 315 320 

Glu Ala Ser Leu Ile Gly Arg Gly Ala Val Val Gly Pro Ala Pro Arg 
325 330 335 

Leu Pro Gln Ala His Arg Leu Val Ile Gly Asp His Ser Lys Val Tyr 
340 345 350 

Leu Thr Pro 
355 



(2) INFORMATION FOR SEQ ID NO.: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 325 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(ix) FEATURES: 

(A) NAME/KEY: acbD 

(B) LOCATION: 1..325 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 11: 

Met Thr Thr Thr Ile Leu Val Thr Gly Gly Ala Gly Phe Ile Arg Ser 
1 5 10 15 

Ala Tyr Val Arg Arg Leu Leu Ser Pro Gly Ala Pro Gly Gly Val Ala 
20 25 30 

Val Thr Val Leu Asp Lys Leu Thr Tyr Ala Gly Ser Leu Ala Arg Leu 
35 40 45 

His Ala Val Arg Asp His Pro Gly Leu Thr Phe Val Gln Gly Asp Val 
50 55 60 

Cys Asp Thr Ala Leu Val Asp Thr Leu Ala Ala Arg His Asp Asp Ile 
65 70 75 80 

Val His Phe Ala Ala Glu Ser His Val Asp Arg Ser Ile Thr Asp Ser 
85 90 95 

Gly Ala Phe Thr Arg Thr Asn Val Leu Gly Thr Gln Val Leu Leu Asp 
100 105 110 

Ala Ala Leu Arg His Gly Val Arg Thr Phe Val His Val Ser Thr Asp 
115 120 125 

Glu Val Tyr Gly Ser Leu Pro His Gly Ala Ala Ala Glu Ser Asp Pro 
130 135 140 

Leu Leu Pro Thr Ser Pro Tyr Ala Ala Ser Lys Ala Ala Ser Asp Leu 
145 150 155 160 

Met Ala Leu Ala His His Arg Thr His Gly Leu Asp Val Arg Val Thr 
165 170 175 

Arg Cys Ser Asn Asn Phe Gly Pro His Gln His Pro Glu Lys Leu Ile 
180 185 190 

Gly Asp Gly Arg His Val Arg Asp Trp Leu His Val Asp Asp His Val 
210 215 220 

Arg Ala Val Glu Leu Val Arg Val Ser Gly Arg Pro Gly Glu Ile Tyr 
225 230 235 240 

Asn Ile Gly Gly Gly Thr Ser Leu Pro Asn Leu Glu Leu Thr His Arg 
245 250 255 

Leu Leu Ala Leu Cys Gly Ala Gly Pro Glu Arg Ile Val His Val Glu 
260 265 270 

Asn Arg Lys Gly His Asp Arg Arg Tyr Ala Val Asp His Ser Lys Ile 
275 280 285 

Thr Ala Glu Leu Gly Tyr Arg Pro Arg Thr Asp Phe Ala Thr Ala Leu 
290 295 300 

Ala Asp Thr Ala Lys Trp Tyr Glu Arg His Glu Asp Trp Trp Arg Pro 
305 310 315 320 

Leu Leu Ala Ala Thr 
325 



(2) INFORMATION FOR SEQ ID NO.: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 345 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(ix) FEATURES: 

(A) NAME/KEY: acbE 

(B) LOCATION: 1..345 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 12: 

Met Thr Met Arg Leu Ala Asp Ile Ala Ala Arg Ala Gly Val Ser Lys 
1 5 10 15 

Ala Thr Val Ser Arg Val Leu Asn Gly Ala Arg Gly Val Ala Thr Gly 
20 25 30 

Thr Arg Asp Ser Val Leu Thr Ala Leu Asp Ala Leu Gly Tyr Glu Gln 
35 40 45 

Pro Ala Arg Gln Arg Arg Ala Glu Leu Val Gly Leu Val Thr Pro Glu 
50 55 60 

Leu Asp Asn Pro Phe Phe Pro Ala Leu Ala Gln Val Met Gly Gln Ala 
65 70 75 80 

Leu Thr Arg Gln Gly Tyr Thr Pro Val Leu Ala Thr Gln Thr Pro Gly 
85 90 95 

Gly Ser Thr Glu Asp Glu Leu Thr Glu Met Leu Val Asp Arg Gly Val 
100 105 110 

Ser Gly Ile Ile Phe Val Ser Gly Leu His Ala Asp Thr Thr Ala Glu 
115 120 125 

Thr Gly Arg Tyr Gly Arg Leu His Glu Arg Gln Val Pro Phe Val Leu 
130 135 140 

Val Asn Gly Phe Ser Pro Arg Ile Glu Ala Pro Phe Val Ser Pro Asp 
145 150 155 160 

Asp Arg Ala Ala Met Arg Leu Ala Val Ala His Leu Ala Glu Leu Gly 
165 170 175 

His Glu Arg Val Gly Leu Ala Val Gly Pro Ala Arg Phe Val Pro Val 
180 185 190 

Gln Arg Lys Ile Glu Gly Phe Arg Ala Gly Val Arg Glu His Leu Gly 
195 200 205 

Val Ser Ala Arg Glu Ser Glu Glu Leu Val Gln His Ser Leu Phe Ser 
210 215 220 

Leu Glu Gly Gly Gln Ala Ala Ala Ser Ala Leu Ile Asp Leu Gly Cys 
225 230 235 240 

Thr Ala Val Met Cys Ala Ser Asp Met Met Ala Leu Gly Ala Val Arg 
245 250 255 

Ala Ala Arg Arg Arg Gly Leu Thr Val Pro Gly Asp Ile Ser Val Val 
260 265 270 

Gly Phe Asp Asp Ser Pro Leu Met Ala Phe Thr Asp Pro Pro Leu Thr 
275 280 285 

Thr Ile Arg Gln Pro Val Lys Ala Met Gly Gln Val Ala Val Asp Ala 
290 295 300 

Leu Leu Glu Glu Met Ser Gly Thr Pro Pro Pro Arg Thr Glu Phe Val 
305 310 315 320 

Phe Met Pro Glu Leu Val Val Arg Gly Ser Thr Ala Ala Gly Pro Arg 
325 330 335 

Gly Gly Arg Arg Pro Ala His Gly Arg 
340 345 



(2) INFORMATION FOR SEQ ID NO.: 13: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 393 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(ix) FEATURES: 

(A) NAME/KEY: acbF 

(B) LOCATION: 1..393 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO.: 13: 

Met Arg Arg Gly Ile Ala Ala Thr Ala Leu Phe Ala Ala Val Ala Met 
1 5 10 15 

Thr Ala Ser Ala Cys Gly Gly Gly Asp Asn Gly Gly Ser Gly Thr Asp 
20 25 30 

Ala Gly Gly Thr Glu Leu Ser Gly Thr Val Thr Phe Trp Asp Thr Ser 
35 40 45 

Asn Glu Ala Glu Lys Ala Thr Tyr Gln Ala Leu Ala Glu Gly Phe Glu 
50 55 60 

Lys Glu His Pro Lys Val Asp Val Lys Tyr Val Asn Val Pro Phe Gly 
65 70 75 80 

Glu Ala Asn Ala Lys Phe Lys Asn Ala Ala Gly Gly Asn Ser Gly Ala 
85 90 95 

Pro Asp Val Met Arg Thr Glu Val Ala Trp Val Ala Asp Phe Ala Ser 
100 105 110 

Ile Gly Tyr Leu Ala Pro Leu Asp Gly Thr Pro Ala Leu Asp Asp Gly 
115 120 125 

Ser Asp His Leu Pro Gln Gly Gly Ser Thr Arg Tyr Glu Gly Lys Thr 
130 135 140 

Tyr Ala Val Pro Gln Val Ile Asp Thr Leu Ala Leu Phe Tyr Asn Lys 
145 150 155 160 

Glu Leu Leu Thr Lys Ala Gly Val Glu Val Pro Gly Ser Leu Ala Glu 
165 170 175 

Leu Lys Thr Ala Ala Ala Glu Ile Thr Glu Lys Thr Gly Ala Ser Gly 
180 185 190 

Leu Tyr Cys Gly Ala Thr Thr Arg Thr Trp Phe Leu Pro Tyr Leu Tyr 
195 200 205 

Gly Glu Gly Gly Asp Leu Val Asp Glu Lys Asn Lys Thr Val Thr Val 
210 215 220 

Asp Asp Glu Ala Gly Val Arg Ala Tyr Arg Val Ile Lys Asp Leu Val 
225 230 235 240 

Asp Ser Lys Ala Ala Ile Thr Asp Ala Ser Asp Gly Trp Asn Asn Met 
245 250 255 

Gln Asn Ala Phe Lys Ser Gly Lys Val Ala Met Met Val Asn Gly Pro 
260 265 270 

Trp Ala Ile Glu Asp Val Lys Ala Gly Ala Arg Phe Lys Asp Ala Gly 
275 280 285 

Asn Leu Gly Val Ala Pro Val Pro Ala Gly Ser Ala Gly Gln Gly Ser 
290 295 300 

Pro Gln Gly Gly Trp Asn Leu Ser Val Tyr Ala Gly Ser Lys Asn Leu 
305 310 315 320 

Asp Ala Ser Tyr Ala Phe Val Lys Tyr Met Ser Ser Ala Lys Val Gln 
325 330 335 

Gln Gln Thr Thr Glu Lys Leu Ser Leu Leu Pro Thr Arg Thr Ser Val 
340 345 350 

Tyr Glu Val Pro Ser Val Ala Asp Asn Glu Met Val Lys Phe Phe Lys 
355 360 365 

Pro Ala Val Asp Lys Ala Val Glu Arg Pro Trp Ile Ala Glu Gly Asn 
370 375 380 

Ala Leu Phe Glu Pro Ile Arg Leu Gln 
385 390 



